date:20070724

Re: -mm merge plans for 2.6.23

2007-07-24 Thread david

On Wed, 25 Jul 2007, Nick Piggin wrote:

Eric St-Laurent wrote:

 On Wed, 2007-25-07 at 06:55 +0200, Rene Herman wrote:

> It certainly doesn't run for me ever. Always kind of a "that's not the 
> point" comment but I just keep wondering whenever I see anyone complain 
> about updatedb why the _hell_ they are running it in the first place. If 
> anyone who never uses "locate" for anything simply disabled updatedb, the 
> problem would for a large part be solved.
> 
> This is not just meant as a cheap comment; while I can think of a few 
> similar loads even on the desktop (scanning a browser cache, a media 
> player indexing a large amount of media files, ...) I've never heard of 
> problems _other_ than updatedb. So just junk that crap and be happy.

From my POV there's two different problems discussed recently:

 - updatedb type of workloads that add tons of inodes and dentries in the
 slab caches which of course use the pagecache.

 - streaming large files (read or copying) that fill the pagecache with
 useless used-once data

 swap prefetch fixes the first case, drop-behind fixes the second case.

OK, this is where I start to worry. Swap prefetch AFAIKS doesn't fix
the updatedb problem very well, because if updatedb has caused swapout
then it has filled memory, and swap prefetch doesn't run unless there
is free memory (not to mention that updatedb would have paged out other
files as well).

And drop behind doesn't fix your usual problem where you are downloading
from a server, because that is use-once write(2) data which is the
problem. And this readahead-based drop behind also doesn't help if data
you were reading happened to be a sequence of small files, or otherwise
not in good readahead order.

Not to say that neither fix some problems, but for such conceptually
big changes, it should take a little more effort than a constructed test
case and no consideration of the alternatives to get it merged.

well, there appears to be a fairly large group of people who have 
subjective opinions that it helps them. but those were dismissed because 
they aren't measurements.

so now the measurements of the constructed test case aren't acceptable.

what sort of test case would be acceptable?

David Lang
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: -mm merge plans for 2.6.23

2007-07-24 Thread david

On Wed, 25 Jul 2007, Rene Herman wrote:

On 07/25/2007 07:12 AM, [EMAIL PROTECTED] wrote:

 On Wed, 25 Jul 2007, Rene Herman wrote:

>  It certainly doesn't run for me ever. Always kind of a "that's not the 
>  point" comment but I just keep wondering whenever I see anyone complain 
>  about updatedb why the _hell_ they are running it in the first place. If 
>  anyone who never uses "locate" for anything simply disabled updatedb, the 
>  problem would for a large part be solved.
> 
>  This is not just meant as a cheap comment; while I can think of a few 
>  similar loads even on the desktop (scanning a browser cache, a media 
>  player indexing a large amount of media files, ...) I've never heard of 
>  problems _other_ than updatedb. So just junk that crap and be happy.

 but if you do use locate then the alternative becomes sitting around and
 waiting for find to complete on a regular basis.

Yes, but what's locate's usage scenario? I've never, ever wanted to use it. 
When do you know the name of something but not where it's located, other than 
situations which "which" wouldn't cover and after just having 
installed/unpacked something meaning locate doesn't know about it yet either?

which only finds executables that are in the path.

I commonly use locate to find config files (or sample config files) for 
packages that were installed at some point in the past with fairly default 
configs and now I want to go and tweak them. so I start reading 
documentation and then need to find out where $distro moved the files to 
this release (I commonly am working on machines with over a half dozen 
different distro releases, and none of them RedHat)

David Lang

Re: 2.6.23-rc1 regression: mm: fix fault vs invalidate race for linear mappings

2007-07-24 Thread Nick Piggin


Dave Airlie wrote:


Is this with a binary-only module? We saw an issue with that in SLES9
where the module is returning a locked page from its nopage handler
when it isn't really supposed to. It might be fixed in latest drivers,
have you tried them?



Doesn't sound like it; he mentions the radeon drm module, which is open...


OK. Well from the task trace, X is getting stuck on a locked page. And
as it is never getting unlocked, I'd be almost positive it comes from
a driver's nopage. I know some ATI driver did that in the past. Would
the radeon drm module do anything similar?

--
SUSE Labs, Novell Inc.

Re: [PATCH] Print utsname on Oops on all architectures

2007-07-24 Thread Andrew Morton

On Thu, 5 Jul 2007 18:52:27 -0700 (PDT) Joshua Wise <[EMAIL PROTECTED]> wrote:

> Background:
>  This patch is a follow-on to "Info dump on Oops or panic()" [1].
>  
>  On some architectures, the kernel printed some information on the running
>  kernel, but not on all architectures. The information printed was generally
>  the version and build number, but it was not located in a consistent place,
>  and some architectures did not print it at all.
> 
> Description:
>  This patch uses the already-existing die_chain to print utsname information
>  on Oops. This patch also removes the architecture-specific utsname
>  printers. To avoid crashing the system further (and hence not printing the
>  Oops) in the case where the system is so hopelessly smashed that utsname
>  might be destroyed, we vsprintf the utsname data into a static buffer
>  first, and then just print that on crash.
> 
> Testing:
>  I wrote a module that does a *(int*)0 = 0; and observed that I got my
>  utsname data printed.
> 
> Potential impact:
>  This adds another line to the Oops output, causing the first few lines to
>  potentially scroll off the screen. This also adds a few more pointer
>  dereferences in the Oops path, because it adds to the die_chain notifier
>  chain, reducing the likelihood that the Oops will be printed if there is
>  very bad memory corruption.
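
The idea in the Description above (format the utsname data once into a static buffer while the system is healthy, then only print that buffer in the crash path) can be sketched in userspace. This is a minimal illustration under stated assumptions, not the patch itself: it uses POSIX uname(2) in place of the kernel's utsname data, and the names banner_buf, banner_init and banner_get are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical static buffer, filled once ahead of any crash. */
static char banner_buf[256];

/* Format the identification string early. Returns 0 on success. */
int banner_init(void)
{
	struct utsname u;

	if (uname(&u) != 0)
		return -1;
	/* sysname, release, version, machine: one line of identification */
	snprintf(banner_buf, sizeof(banner_buf), "%s %s %s %s",
		 u.sysname, u.release, u.version, u.machine);
	return 0;
}

/* Crash path: no formatting work left, only the saved bytes are handed out. */
const char *banner_get(void)
{
	return banner_buf;
}
```

The point of the split is that the crash path touches nothing but one static array, so a corrupted utsname structure at Oops time cannot cause a further fault while printing.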

There are strange happenings due to this patch on i386:

Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
hdc: max request size: 128KiB
hdc: 156355584 sectors (80054 MB) w/1819KiB Cache, CHS=65535/16/63, UDMA(33)
hdc: cache flushes supported
 hdc:<0>Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
 hdc1 hdc2 hdc3 hdc4 <<0>Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
 hdc5<0>Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
 hdc6 >
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
initcall 0xc052a060: idedisk_init+0x0/0x10() returned 0.
initcall 0xc052a060 ran for 38 msecs: idedisk_init+0x0/0x10()
Calling initcall 0xc052a070: ide_cdrom_init+0x0/0x10()
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
Linux 2.6.23-rc1-mm1 #7 SMP Tue Jul 24 22:34:40 PDT 2007 i686
initcall 0xc052a070: ide_cdrom_init+0x0/0x10() returned 0.
initcall 0xc052a070 ran for 3 msecs: ide_cdrom_init+0x0/0x10()
Calling initcall 0xc052a080: idetape_init+0x0/0x90()
initcall 0xc052a080: idetape_init+0x0/0x90() returned 0.
initcall 0xc052a080 ran for 0 msecs: idetape_init+0x0/0x90()
Calling initcall 0xc052a110: idefloppy_init+0x0/0x20()
ide-floppy driver 0.99.newide
initcall 0xc052a110: idefloppy_init+0x0/0x20()

Re: -mm merge plans for 2.6.23

2007-07-24 Thread Nick Piggin


Eric St-Laurent wrote:

On Wed, 2007-25-07 at 06:55 +0200, Rene Herman wrote:


It certainly doesn't run for me ever. Always kind of a "that's not the 
point" comment but I just keep wondering whenever I see anyone complain 
about updatedb why the _hell_ they are running it in the first place. If 
anyone who never uses "locate" for anything simply disabled updatedb, the 
problem would for a large part be solved.


This is not just meant as a cheap comment; while I can think of a few similar 
loads even on the desktop (scanning a browser cache, a media player indexing 
a large amount of media files, ...) I've never heard of problems _other_ 
than updatedb. So just junk that crap and be happy.




From my POV there's two different problems discussed recently:


- updatedb type of workloads that add tons of inodes and dentries in the
slab caches which of course use the pagecache.

- streaming large files (read or copying) that fill the pagecache with
useless used-once data

swap prefetch fixes the first case, drop-behind fixes the second case.


OK, this is where I start to worry. Swap prefetch AFAIKS doesn't fix
the updatedb problem very well, because if updatedb has caused swapout
then it has filled memory, and swap prefetch doesn't run unless there
is free memory (not to mention that updatedb would have paged out other
files as well).

And drop behind doesn't fix your usual problem where you are downloading
from a server, because that is use-once write(2) data which is the
problem. And this readahead-based drop behind also doesn't help if data
you were reading happened to be a sequence of small files, or otherwise
not in good readahead order.

Not to say that neither fix some problems, but for such conceptually
big changes, it should take a little more effort than a constructed test
case and no consideration of the alternatives to get it merged.

--
SUSE Labs, Novell Inc.

Re: -mm merge plans for 2.6.23

2007-07-24 Thread Rene Herman


On 07/25/2007 07:12 AM, [EMAIL PROTECTED] wrote:


On Wed, 25 Jul 2007, Rene Herman wrote:


It certainly doesn't run for me ever. Always kind of a "that's not the 
point" comment but I just keep wondering whenever I see anyone 
complain about updatedb why the _hell_ they are running it in the 
first place. If anyone who never uses "locate" for anything simply 
disabled updatedb, the problem would for a large part be solved.


This is not just meant as a cheap comment; while I can think of a few 
similar loads even on the desktop (scanning a browser cache, a media 
player indexing a large amount of media files, ...) I've never heard 
of problems _other_ than updatedb. So just junk that crap and be happy.


but if you do use locate then the alternative becomes sitting around and 
waiting for find to complete on a regular basis.


Yes, but what's locate's usage scenario? I've never, ever wanted to use it. 
When do you know the name of something but not where it's located, other 
than situations which "which" wouldn't cover and after just having 
installed/unpacked something meaning locate doesn't know about it yet either?


Rene.


Re: -mm merge plans for 2.6.23

2007-07-24 Thread Eric St-Laurent

On Wed, 2007-25-07 at 06:55 +0200, Rene Herman wrote:

> It certainly doesn't run for me ever. Always kind of a "that's not the 
> point" comment but I just keep wondering whenever I see anyone complain 
> about updatedb why the _hell_ they are running it in the first place. If 
> anyone who never uses "locate" for anything simply disabled updatedb, the 
> problem would for a large part be solved.
> 
> This is not just meant as a cheap comment; while I can think of a few similar 
> loads even on the desktop (scanning a browser cache, a media player indexing 
> a large amount of media files, ...) I've never heard of problems _other_ 
> than updatedb. So just junk that crap and be happy.

From my POV there's two different problems discussed recently:

- updatedb type of workloads that add tons of inodes and dentries in the
slab caches which of course use the pagecache.

- streaming large files (read or copying) that fill the pagecache with
useless used-once data

swap prefetch fixes the first case, drop-behind fixes the second case.

Both have the same symptoms but the cause is different.

Personally updatedb doesn't really hurt me.  But I don't have that many
files on my desktop.  I've tried the swap prefetch patch in the past and
it was not so noticeable for me. (I don't doubt it's helpful for others)

But every time I read or copy a large file around (usually from a
server) the slowdown is noticeable for some moments.

I just wanted to point this out, if it wasn't clear enough for everyone.
I hope both problems get fixed.

Best regards,

- Eric


Re: [RFC] fs/super.c: Why alloc_super use a static variable default_op?

2007-07-24 Thread rae l

On 7/25/07, Al Viro <[EMAIL PROTECTED]> wrote:

On Wed, Jul 25, 2007 at 12:29:17PM +0800, rae l wrote:
> But is it valuable? Compared to a waste of sizeof(struct super_block)
> bytes memory.

It's less than struct super_block, actually.

> When some code wants to refer to fs_type->s_op, it almost always wants to
> refer to some function pointer in s_op with fs_type->s_op->***, but the
> pointers in default_op are all NULL. What about this scenario?

Yes, and?  You still need one test instead of two.  Which gets you
more than 21 words used by that sucker, only in .text instead of .bss.

> and if you grep for s_op in the source code, you will find that no code
> tests s_op or depends on s_op not being NULL.

What?  fs/inode.c:
	if (sb->s_op->alloc_inode)
		inode = sb->s_op->alloc_inode(sb);
	else
		inode = (struct inode *) kmem_cache_alloc(inode_cachep,
							  GFP_KERNEL);
and the same goes everywhere else.  Of course we don't check for
sb->s_op not being NULL - that's exactly why we are safe skipping such
tests.

Oh, Thank you.

But many other subsystems still do the double test, for example
fs/dcache.c:
	void dput(struct dentry *dentry)
		if (dentry->d_op && dentry->d_op->d_delete) {
Do you think it's worth optimizing that with a statically allocated
default d_op?

Could we add a static variable to d_alloc and set the initial d_op to
point to it?
	struct dentry *d_alloc(struct dentry * parent, const struct qstr *name)
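
The "one test instead of two" point can be sketched outside the kernel. This is a minimal illustration under stated assumptions, not kernel code: struct ops, struct obj, default_ops, doubling_ops and the helper functions are hypothetical stand-ins for the s_op/d_op tables discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of an operations table; not the kernel's struct. */
struct ops {
	int (*alloc)(int);
};

struct obj {
	const struct ops *op;
};

/* All-NULL default table, playing the role described for default_op. */
static const struct ops default_ops = { NULL };

/* Every object starts out pointing at the default table, never at NULL. */
void obj_init(struct obj *o)
{
	o->op = &default_ops;
}

/* One test: o->op is always valid, so only the method pointer is checked. */
int call_one_test(const struct obj *o, int x)
{
	if (o->op->alloc)
		return o->op->alloc(x);
	return -1;	/* generic fallback */
}

/* Two tests: needed when o->op itself may be NULL, as in the dput() case. */
int call_two_tests(const struct obj *o, int x)
{
	if (o->op && o->op->alloc)
		return o->op->alloc(x);
	return -1;	/* generic fallback */
}

/* Example method and table, used only to exercise the calls. */
int twice(int x)
{
	return 2 * x;
}

const struct ops doubling_ops = { twice };
```

The trade is a few words of static data for one fewer branch at every call site, which is the comparison Al Viro makes between .bss and .text.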

--
Denis Cheng
Linux Application Developer

"One of my most productive days was throwing away 1000 lines of code."
- Ken Thompson.

Re: [PATCH 0/3] readahead drop behind and size adjustment

2007-07-24 Thread Nick Piggin


Eric St-Laurent wrote:

On Mon, 2007-23-07 at 19:00 +1000, Nick Piggin wrote:



I don't like this kind of conditional information going from something
like readahead into page reclaim. Unless it is for readahead _specific_
data such as "I got these all wrong, so you can reclaim them" (which
this isn't).

But I don't like it as a use-once thing. The VM should be able to get
that right.





Question: How does the use-once code work in the current kernel? Is there
any? It doesn't quite work for me...


What *I* think is supposed to happen is that newly read in pages get
put on the inactive list, and unless they get accessed again before
being reclaimed, they are allowed to fall off the end of the list
without disturbing active data too much.

I think there is a missing piece here, that we used to ease the reclaim
pressure off the active list when the inactive list grows relatively
much larger than it (which could indicate a lot of use-once pages in
the system).

Andrew got rid of that logic for some reason which I don't know, but I
can't see that use-once would be terribly effective today (so your
results don't surprise me too much).

I think I've been banned from touching vmscan.c, but if you're keen to
try a patch, I might be convinced to come out of retirement :)
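
The use-once behaviour described above (new pages start on the inactive list, a second access promotes them, and untouched pages fall off the end) can be sketched as a toy model. This is only an illustration of that description, not the vmscan.c logic: struct toy_lru, CAP and the function names are hypothetical, both lists have a fixed size, and a full active list simply drops its oldest entry instead of demoting it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAP 4	/* hypothetical capacity of each toy list, in pages */

/* Two FIFO lists of page ids; index 0 is the oldest entry. */
struct toy_lru {
	int inactive[CAP];
	int n_inactive;
	int active[CAP];
	int n_active;
};

/* Return the position of id in v, or -1 if it is not there. */
static int find(const int *v, int n, int id)
{
	int i;

	for (i = 0; i < n; i++)
		if (v[i] == id)
			return i;
	return -1;
}

/* Remove the entry at position i, closing the gap. */
static void remove_at(int *v, int *n, int i)
{
	memmove(&v[i], &v[i + 1], (size_t)(*n - i - 1) * sizeof(v[0]));
	(*n)--;
}

/* Append id as the newest entry, dropping the oldest if the list is full. */
static void push(int *v, int *n, int id)
{
	if (*n == CAP)
		remove_at(v, n, 0);
	v[(*n)++] = id;
}

/* Record one access to page id. */
void toy_access(struct toy_lru *c, int id)
{
	int i = find(c->active, c->n_active, id);

	if (i >= 0) {			/* already active: refresh position */
		remove_at(c->active, &c->n_active, i);
		push(c->active, &c->n_active, id);
		return;
	}
	i = find(c->inactive, c->n_inactive, id);
	if (i >= 0) {			/* second access: promote */
		remove_at(c->inactive, &c->n_inactive, i);
		push(c->active, &c->n_active, id);
		return;
	}
	push(c->inactive, &c->n_inactive, id);	/* first access */
}

int toy_is_active(const struct toy_lru *c, int id)
{
	return find(c->active, c->n_active, id) >= 0;
}

int toy_is_inactive(const struct toy_lru *c, int id)
{
	return find(c->inactive, c->n_inactive, id) >= 0;
}
```

In this model a page touched twice survives a long stream of pages that are each touched once, because the stream only churns the inactive list.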


See my previous email today, I've done a small test case to demonstrate 
the problem and the effectiveness of Peter's patch.  The only piece
missing is the copy case (read once + write once).

Regardless of how it's implemented, I think a similar mechanism must be
added. This is a long standing issue.

In the end, I think it's a pagecache resource allocation problem. The
VM lacks fair-share limits between processes. The kernel doesn't have
enough information to make the right decisions.

You can refine or use more advanced page reclaim, but some fair-share
splitting (like the CPU scheduler) between the processes must be
present.  Of course some process should have large or unlimited VM
limits, like databases.

Maybe the "containers" patchset and memory controller can help.  With
some specific configuration and/or a userspace daemon to adjust the
limits on the fly.

Independently, the basic large file streaming read (or copy) once cases
should not trash the pagecache. Can we agree on that?


One man's trash is another's treasure: some people will want the
files to remain in cache because they'll use them again (copy it
somewhere else, or start editing it after being copied or whatever).

But yeah, we can probably do better at the sequential read/write
case.



I say, let's add some code to fix the problem.  If we hear about any
regression in some workloads, we can add a tunable to limit or disable
its effects, _if_ a better compromise solution cannot be found.


Sure, but let's figure out the workloads and look at all the
alternatives first.

--
SUSE Labs, Novell Inc.

Re: [PATCH] x86_64 tce section mismatch

2007-07-24 Thread Muli Ben-Yehuda

On Tue, Jul 24, 2007 at 02:17:02PM -0700, Randy Dunlap wrote:
> From: Randy Dunlap <[EMAIL PROTECTED]>
> 
> Fix section mismatch warnings:
> these functions are called only from __init functions.
> 
> WARNING: vmlinux.o(.text+0x1861c): Section mismatch: reference to .init.text:free_bootmem (between 'free_tce_table' and 'build_tce_table')
> WARNING: vmlinux.o(.text+0x187e5): Section mismatch: reference to .init.text:__alloc_bootmem_low (between 'alloc_tce_table' and 'kretprobe_trampoline_holder')
> 
> Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

At some point in time we will need to support hotplug with IOMMU
translation enabled, in which case they'll be called when hotplug
happens as well, but in the mean time

Signed-off-by: Muli Ben-Yehuda <[EMAIL PROTECTED]>

I'll push it with the next Calgary update unless Andi picks it up
first.

Cheers,
Muli

Re: -mm merge plans for 2.6.23

2007-07-24 Thread david

On Wed, 25 Jul 2007, Rene Herman wrote:

On 07/25/2007 06:06 AM, Nick Piggin wrote:

 Ray Lee wrote:

>  Anyway, my point is that I worry that tuning for an unusual and 
>  infrequent workload (which updatedb certainly is), is the wrong way to 
>  go.

 Well it runs every day or so for every desktop Linux user, and it has
 similarities with other workloads.

It certainly doesn't run for me ever. Always kind of a "that's not the point" 
comment but I just keep wondering whenever I see anyone complain about 
updatedb why the _hell_ they are running it in the first place. If anyone who 
never uses "locate" for anything simply disabled updatedb, the problem would 
for a large part be solved.

This is not just meant as a cheap comment; while I can think of a few similar 
loads even on the desktop (scanning a browser cache, a media player indexing 
a large amount of media files, ...) I've never heard of problems _other_ than 
updatedb. So just junk that crap and be happy.

but if you do use locate then the alternative becomes sitting around and 
waiting for find to complete on a regular basis.

David Lang

Re: [PATCH][RFC] getting rid of stupid loop in BUG()

2007-07-24 Thread H. Peter Anvin

Keith Owens wrote:
> Trent Piepho (on Tue, 24 Jul 2007 19:31:36 -0700 (PDT)) wrote:
>> Adding __builtin_trap after the
>> asm might be an ok fix.  It will emit a spurious int 6, but that won't even be
>> reached since the asm doesn't return, and it will probably be less extra code
>> than the loop.
> 
> int 6 is a two byte instruction, the loop generates jmp with an 8 bit
> offset, also two bytes.  No change in code size.
> 

INT 6 is #UD, so the __builtin_trap() replaces the ud2a as well as the loop.

How far back was __builtin_trap() supported?
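
As a rough illustration of what the builtin buys (a sketch only, not the kernel's BUG() macro): GCC and Clang treat __builtin_trap() as not returning, so a function may end in it without a trailing for (;;) loop. The functions checked_div and color_code below are hypothetical, and the exact trap instruction emitted depends on the compiler and target.

```c
#include <assert.h>

/* Hypothetical stand-in for a BUG()-style check; not the kernel macro. */
int checked_div(int a, int b)
{
	if (b == 0)
		__builtin_trap();	/* abnormal termination, never returns */
	return a / b;
}

enum color { RED, GREEN };

/*
 * The compiler knows control cannot continue past __builtin_trap(), so
 * this non-void function needs neither a dummy return nor an endless
 * loop after the switch.
 */
int color_code(enum color c)
{
	switch (c) {
	case RED:
		return 10;
	case GREEN:
		return 20;
	}
	__builtin_trap();
}
```

Only the non-trapping paths can be exercised in a normal run; reaching the builtin ends the program.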

-hpa

Re: [BUG] firewire: mass-storage i/o-problems

2007-07-24 Thread Manuel Lauss

On Tue, Jul 24, 2007 at 09:56:59PM +0200, Stefan Richter wrote:
> Manuel Lauss wrote:
> > Actually, copying data to the disk while playing/seeking through a moviefile
> > which is also located on it is already enough. Forget the NFS thing...
> > 
> > Afterwards the firewire_sbp2 module has to be rmmod-ed and modprobed again
> > or it will continue to throw errors even for single reads.
> > 
> > I hope this helps tracking it down...
> 
> I tried this and similar tests on my main PC (PCIe based) and on an
> Athlon/KM266 PC, with 1394b and 1394a hardware.  Nothing happened,
> except for a single "status write for unknown orb", followed by command
> abort from which the disk immediately recovered.  I did many tests and
> it didn't happen again.  I.e. it's probable that the supposed bug
> happens here too, but very rarely.

I tried 2.6.23 in the meantime, it's *MUCH* harder to trigger; in fact
I had to skip through movies for ~10 minutes to get the orb timeout.
The disk was inaccessible for a few seconds then recovered fine.

> Could you (and everyone else who has repeated I/O errors with the new
> drivers, but not with the old drivers) test the attached patches, one
> patch at a time?  They apply to 2.6.22.

Will do.

Thanks,
Manuel Lauss

Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Mike Galbraith

On Tue, 2007-07-24 at 11:25 -0700, Linus Torvalds wrote:
> 
> On Tue, 24 Jul 2007, Andrew Morton wrote:
> > 
> > I guess this was the bug:
> 
> Looks very likely to me. Mike, Alexey, does this fix things for you?

I don't have very much runtime on it yet, but yes, it seems to have.

-Mike


Re: [PATCH]: allow individual core dump methods to be unlimited when sending to a pipe

2007-07-24 Thread Andrew Morton


SuperH allmodconfig broke:

fs/binfmt_flat.c:83: warning: initialization from incompatible pointer type
fs/binfmt_flat.c:94: error: conflicting types for 'flat_core_dump'
fs/binfmt_flat.c:78: error: previous declaration of 'flat_core_dump' was here
fs/binfmt_flat.c:94: error: conflicting types for 'flat_core_dump'
fs/binfmt_flat.c:78: error: previous declaration of 'flat_core_dump' was here
fs/binfmt_flat.c: In function `decompress_exec':
fs/binfmt_flat.c:293: warning: label `out' defined but not used
fs/binfmt_flat.c: In function `load_flat_file':
fs/binfmt_flat.c:462: warning: unsigned int format, long int arg (arg 3)
fs/binfmt_flat.c:462: warning: unsigned int format, long int arg (arg 4)
fs/binfmt_flat.c:518: warning: comparison of distinct pointer types lacks a cast
fs/binfmt_flat.c:549: warning: passing arg 1 of `ksize' makes pointer from integer without a cast
fs/binfmt_flat.c:601: warning: passing arg 1 of `ksize' makes pointer from integer without a cast
fs/binfmt_flat.c: At top level:
fs/binfmt_flat.c:78: warning: 'flat_core_dump' used but never defined
fs/binfmt_flat.c:94: warning: 'flat_core_dump' defined but not used

Re: [PATCH] Fix corruption of memmap on IA64 SPARSEMEM when mem_section is not a power of 2

2007-07-24 Thread Andrew Morton

On Tue, 13 Mar 2007 10:42:02 + [EMAIL PROTECTED] (Mel Gorman) wrote:

> There are problems in the use of SPARSEMEM and pageblock flags that cause
> problems on ia64.
> 
> The first part of the problem is that units are incorrect in
> SECTION_BLOCKFLAGS_BITS computation. This results in a map_section's
> section_mem_map being treated as part of a bitmap which isn't good. This
> was evident with an invalid virtual address when mem_init attempted to free
> bootmem pages while relinquishing control from the bootmem allocator.
> 
> The second part of the problem occurs because the pageblock flags bitmap is
> located with the mem_section. The SECTIONS_PER_ROOT computation using
> sizeof (mem_section) may not be a power of 2 depending on the size of the
> bitmap. This renders masks and other such things not power of 2 base. This
> issue was seen with SPARSEMEM_EXTREME on ia64. This patch moves the bitmap
> outside of mem_section and uses a pointer instead in the mem_section. The
> bitmaps are allocated when the section is being initialised.
> 
> Note that sparse_early_usemap_alloc() does not use alloc_remap() like
> sparse_early_mem_map_alloc(). The allocation required for the bitmap on x86,
> the only architecture that uses alloc_remap is typically smaller than a cache
> line. alloc_remap() pads out allocations to the cache size which would be
> a needless waste.
> 
> Credit to Bob Picco for identifying the original problem and effecting a
> fix for the SECTION_BLOCKFLAGS_BITS calculation. Credit to Andy Whitcroft
> for devising the best way of allocating the bitmaps only when required for
> the section.
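
The power-of-2 point in the second part can be shown with plain arithmetic. This is a minimal sketch, assuming the per-root count is obtained by dividing a page size by the entry size; TOY_PAGE_SIZE, the entry sizes 16 and 24, and the function names are hypothetical values chosen for illustration, not taken from any kernel configuration.

```c
#include <assert.h>

/* Hypothetical page size for illustration only. */
#define TOY_PAGE_SIZE 4096u

/* Entries per root when each entry is entry_size bytes. */
unsigned per_root(unsigned entry_size)
{
	return TOY_PAGE_SIZE / entry_size;
}

/* Index within a root computed with a mask; correct only for powers of two. */
unsigned idx_by_mask(unsigned nr, unsigned n)
{
	return nr & (n - 1);
}

/* Index within a root computed with a remainder; correct for any n. */
unsigned idx_by_mod(unsigned nr, unsigned n)
{
	return nr % n;
}

/* Nonzero if n is a power of two. */
int is_pow2(unsigned n)
{
	return n != 0 && (n & (n - 1)) == 0;
}
```

With a 16-byte entry the count is 256 and the mask and remainder agree; growing the entry to 24 bytes gives a count of 170, and the mask then selects the wrong slot. Moving the bitmap out of the structure and keeping only a pointer, as the patch describes, keeps the entry size from depending on the bitmap size.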

SuperH allmodconfig blew up:

mm/sparse.c: In function `sparse_init':
mm/sparse.c:482: error: implicit declaration of function `sparse_early_usemap_alloc'
mm/sparse.c:482: warning: assignment makes pointer from integer without a cast
mm/sparse.c: In function `sparse_add_one_section':
mm/sparse.c:553: error: implicit declaration of function `__kmalloc_section_usemap'
mm/sparse.c:553: warning: assignment makes pointer from integer without a cast

Re: -mm merge plans for 2.6.23

2007-07-24 Thread Nick Piggin


Rene Herman wrote:

On 07/25/2007 06:06 AM, Nick Piggin wrote:


Ray Lee wrote:



Anyway, my point is that I worry that tuning for an unusual and 
infrequent workload (which updatedb certainly is), is the wrong way 
to go.



Well it runs every day or so for every desktop Linux user, and it has
similarities with other workloads.



It certainly doesn't run for me ever. Always kind of a "that's not the 
point" comment but I just keep wondering whenever I see anyone complain 
about updatedb why the _hell_ they are running it in the first place. If 
anyone who never uses "locate" for anything simply disabled updatedb, the 
problem would for a large part be solved.


This is not just meant as a cheap comment; while I can think of a few 
similar loads even on the desktop (scanning a browser cache, a media 
player indexing a large amount of media files, ...) I've never heard of 
problems _other_ than updatedb. So just junk that crap and be happy.


OK, fair point, but the counterpoint is that there are real patterns
that just use-once a lot of metadata (ls, for example. grep even.)

--
SUSE Labs, Novell Inc.

Re: -mm merge plans for 2.6.23

2007-07-24 Thread Rene Herman


On 07/25/2007 06:06 AM, Nick Piggin wrote:


Ray Lee wrote:


Anyway, my point is that I worry that tuning for an unusual and 
infrequent workload (which updatedb certainly is), is the wrong way to 
go.


Well it runs every day or so for every desktop Linux user, and it has
similarities with other workloads.


It certainly doesn't run for me ever. Always kind of a "that's not the 
point" comment but I just keep wondering whenever I see anyone complain 
about updatedb why the _hell_ they are running it in the first place. If 
anyone who never uses "locate" for anything simply disables updatedb, the
problem will for a large part be solved.


This is not just meant as a cheap comment; while I can think of a few similar
loads even on the desktop (scanning a browser cache, a media player indexing 
a large amount of media files, ...) I've never heard of problems _other_ 
than updatedb. So just junk that crap and be happy.


Rene.

Re: [PATCH 4/8] i386: bitops: Kill volatile-casting of memory addresses

2007-07-24 Thread Nick Piggin


Linus Torvalds wrote:


On Tue, 24 Jul 2007, Benjamin Herrenschmidt wrote:



Besides, as Nick pointed out, it prevents some valid optimizations.



No it doesn't. Not the ones on the functions that just do an inline asm.

The only valid optimization it might break is for "constant_test_bit()", 
which isn't even using inline asm.


The constant case is probably most used (at least for page flags), and
is most important for me. constant_test_bit may not be using inline asm,
but the volatile pointer target means that it reloads the value and can't
do much optimisation over it.

BTW. once volatile goes away, i386 really should start using the C
versions of __set_bit and __clear_bit as well IMO. (at least for the
constant bitnr case), so gcc can potentially optimise better.
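
A minimal user-space sketch of the point about the volatile pointer target, assuming simplified stand-ins for the kernel helpers. The demo_ names are hypothetical and this is not the i386 code; it only shows the shape of the two variants. With the volatile version the compiler has to reload the word on every call, while with the plain version it is free to load the word once when two tests hit the same word.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* volatile target: every call must re-read the word from memory. */
int demo_test_bit_volatile(unsigned int nr, const volatile unsigned long *addr)
{
	return (int)((addr[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL);
}

/* Plain target: the compiler may keep the word in a register. */
int demo_test_bit(unsigned int nr, const unsigned long *addr)
{
	return (int)((addr[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL);
}

/* Non-atomic C versions, in the spirit of __set_bit / __clear_bit. */
void demo_set_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

void demo_clear_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / DEMO_BITS_PER_LONG] &= ~(1UL << (nr % DEMO_BITS_PER_LONG));
}

/* Two tests in a row: only the plain variant lets the loads be combined. */
int demo_both_set(const unsigned long *flags, unsigned int a, unsigned int b)
{
	return demo_test_bit(a, flags) && demo_test_bit(b, flags);
}
```

Whether gcc actually merges the loads depends on the compiler version and flags, so comparing the generated assembly of both variants is the only way to be sure.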

--
SUSE Labs, Novell Inc.

[patch] oom: print points as unsigned long

2007-07-24 Thread David Rientjes

In badness(), the automatic variable 'points' is unsigned long.  Print it
as such.

Signed-off-by: David Rientjes <[EMAIL PROTECTED]>
---
 mm/oom_kill.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -156,7 +156,7 @@ unsigned long badness(struct task_struct *p, unsigned long uptime)
 	}
 
 #ifdef DEBUG
-	printk(KERN_DEBUG "OOMkill: task %d (%s) got %d points\n",
+	printk(KERN_DEBUG "OOMkill: task %d (%s) got %lu points\n",
 	p->pid, p->comm, points);
 #endif
 	return points;

Re: -mm merge plans for 2.6.23

2007-07-24 Thread david


On Tue, 24 Jul 2007, Ray Lee wrote:


On 7/23/07, Nick Piggin <[EMAIL PROTECTED]> wrote:

 Ray Lee wrote:



 Looking at your past email, you have a 1GB desktop system and your
 overnight updatedb run is causing stuff to get swapped out such that
 swap prefetch makes it significantly better. This is really
 intriguing to me, and I would hope we can start by making this
 particular workload "not suck" without swap prefetch (and hopefully
 make it even better than it currently is with swap prefetch because
 we'll try not to evict useful file backed pages as well).


updatedb is an annoying case, because one would hope that there would
be a better way to deal with that highly specific workload. It's also
pretty stat dominant, which puts it roughly in the same category as a
git diff. (They differ in that updatedb does a lot of open()s and
getdents on directories, git merely does a ton of lstat()s instead.)

Anyway, my point is that I worry that tuning for an unusual and
infrequent workload (which updatedb certainly is), is the wrong way to
go.


updatedb pushing out program data may be able to be improved on with drop 
behind or similar.


however another scenario that causes a similar problem is when a user is
busy using one of the big memory hogs and then switches to another (think
switching between openoffice and firefox)



 After that we can look at other problems that swap prefetch helps
 with, or think of some ways to measure your "whole day" scenario.

 So when/if you have time, I can cook up a list of things to monitor
 and possibly a patch to add some instrumentation over this updatedb
 run.


That would be appreciated. Don't spend huge amounts of time on it,
okay? Point me the right direction, and we'll see how far I can run
with it.


you could make a synthetic test by writing a memory hog that allocates 3/4 
of your ram then pauses waiting for input and then randomly accesses the 
memory for a while (say randomly accessing 2x # of pages allocated) and 
then pausing again before repeating


run two of these, alternating which one is running at any one time. time 
how long it takes to do the random accesses.


the difference in this time should be a fair example of how much it would 
impact the user.
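
A rough sketch of the two building blocks such a memory hog would need, under a few assumptions: the function names are made up for illustration, the buffer size is left to the caller, and the pause-for-input loop is not shown. One helper allocates and touches the pages, the other does 2x as many random page accesses as there are pages and reports how long that took.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Allocate npages pages and write to each so they are really backed by RAM. */
unsigned char *hog_alloc(size_t npages, size_t page_size)
{
	assert(npages > 0 && page_size > 0);
	unsigned char *buf = malloc(npages * page_size);
	if (buf)
		memset(buf, 1, npages * page_size);
	return buf;
}

/* xorshift64: small PRNG so the result does not depend on RAND_MAX. */
static uint64_t next_rand(uint64_t *state)
{
	uint64_t x = *state;
	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	return *state = x;
}

/*
 * Read one byte from 2 * npages randomly chosen pages and return the
 * elapsed wall-clock time in seconds. *sum receives a checksum so the
 * reads cannot be optimised away.
 */
double hog_random_pass(const unsigned char *buf, size_t npages,
		       size_t page_size, uint64_t seed, unsigned long *sum)
{
	struct timespec t0, t1;
	uint64_t state = seed ? seed : 1;	/* xorshift state must be non-zero */
	unsigned long acc = 0;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (size_t i = 0; i < 2 * npages; i++)
		acc += buf[(next_rand(&state) % npages) * page_size];
	clock_gettime(CLOCK_MONOTONIC, &t1);

	*sum = acc;
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

A driver would size npages to roughly 3/4 of RAM, block on stdin between passes, and print the time from each pass; running two copies and alternating between them gives the numbers to compare.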


by the way, I've also seen comments on the Postgres performance mailing 
list about how slow linux is compared to other OS's in pulling data back 
in that's been pushed out to swap (not a factor on dedicated database 
machines, but a big factor on multi-purpose machines)



 Anyway, I realise swap prefetching has some situations where it will
 fundamentally outperform even the page replacement oracle. This is
 why I haven't asked for it to be dropped: it isn't a bad idea at all.





 However, if we can improve basic page reclaim where it is obviously
 lacking, that is always preferable. eg: being a highly speculative
 operation, swap prefetch is not great for power efficiency -- but we
 still want laptop users to have a good experience as well, right?


Absolutely. Disk I/O is the enemy, and the best I/O is one you never
had to do in the first place.


almost always true, however there is some amount of I/O that is free with 
today's drives (remember, they read the entire track into ram and then
give you the sectors on the track that you asked for). and if you have a 
raid array this is even more true.


if you read one sector in from a raid5 array you have done all the same 
I/O that you would have to do to read in the entire stripe, but I don't 
believe that the current system will keep it all around if it exceeds the 
readahead limit.


so in many cases readahead may end up being significantly cheaper than you
expect.


David Lang

Re: 2.6.23-rc1 regression: mm: fix fault vs invalidate race for linear mappings

2007-07-24 Thread Dave Airlie



Is this with a binary-only module? We saw an issue with that in SLES9
where the module is returning a locked page from its nopage handler
when it isn't really supposed to. It might be fixed in latest drivers,
have you tried them?


Doesn't sound like it; he mentions the radeon drm module, which is open...

Dave.

Re: [RFC] fs/super.c: Why alloc_super use a static variable default_op?

2007-07-24 Thread Al Viro

On Wed, Jul 25, 2007 at 12:29:17PM +0800, rae l wrote:
> But is it valuable, compared to the waste of sizeof(struct super_block)
> bytes of memory?

It's less than struct super_block, actually.

> When some code wants to refer to fs_type->s_op, it almost always wants to
> refer to some function pointer in s_op with fs_type->s_op->***, but the
> pointers in default_op are all NULL; what about this scenario?

Yes, and?  You still need one test instead of two.  Which gets you
more than 21 words used by that sucker, only in .text instead of .bss.

> And if you grep for s_op in the source code, you will find that nothing
> tests s_op or depends on s_op not being NULL.

What?  fs/inode.c:
	if (sb->s_op->alloc_inode)
		inode = sb->s_op->alloc_inode(sb);
	else
		inode = (struct inode *) kmem_cache_alloc(inode_cachep, GFP_KERNEL);
and the same goes everywhere else.  Of course we don't check for
sb->s_op not being NULL - that's exactly why we are safe skipping such
tests.
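
To make the "one test instead of two" point concrete, here is a small user-space sketch. The demo_ structures are hypothetical, cut-down stand-ins and not the kernel's definitions; the only thing taken from the thread is the idea that s_op always points at some table, even if every method in it is NULL.

```c
#include <assert.h>
#include <stddef.h>

struct demo_sb;

/* Cut-down operations table with a single optional method. */
struct demo_super_ops {
	int (*alloc_inode)(struct demo_sb *sb);
};

struct demo_sb {
	const struct demo_super_ops *s_op;
};

/* All-zero table: every method is NULL, but the table itself exists. */
static const struct demo_super_ops demo_default_op;

/* Like alloc_super(): s_op is never left NULL. */
void demo_sb_init(struct demo_sb *sb)
{
	assert(sb != NULL);
	sb->s_op = &demo_default_op;
}

int demo_generic_alloc(struct demo_sb *sb) { (void)sb; return 1; }
int demo_fs_alloc(struct demo_sb *sb) { (void)sb; return 2; }

/* Callers test the method only; no "if (sb->s_op && ...)" is needed. */
int demo_alloc_inode(struct demo_sb *sb)
{
	if (sb->s_op->alloc_inode)
		return sb->s_op->alloc_inode(sb);
	return demo_generic_alloc(sb);
}
```

Without the default table, every call site of this shape would need the extra NULL test on s_op itself, which is the code-size cost being weighed against the one static structure.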

Re: [PATCH 0/3] readahead drop behind and size adjustment

2007-07-24 Thread Eric St-Laurent

On Mon, 2007-23-07 at 19:00 +1000, Nick Piggin wrote:

> I don't like this kind of conditional information going from something
> like readahead into page reclaim. Unless it is for readahead _specific_
> data such as "I got these all wrong, so you can reclaim them" (which
> this isn't).
> 
> But I don't like it as a use-once thing. The VM should be able to get
> that right.
> 

Question: how does the use-once code work in the current kernel? Is there
any? It doesn't quite work for me...

See my previous email today, I've done a small test case to demonstrate 
the problem and the effectiveness of Peter's patch.  The only piece
missing is the copy case (read once + write once).

Regardless of how it's implemented, I think a similar mechanism must be
added. This is a long standing issue.

In the end, I think it's a pagecache resource allocation problem. The
VM lacks fair-share limits between processes. The kernel doesn't have
enough information to make the right decisions.

You can refine or use more advanced page reclaim, but some fair-share
splitting (like the CPU scheduler) between the processes must be
present.  Of course some process should have large or unlimited VM
limits, like databases.

Maybe the "containers" patchset and memory controller can help.  With
some specific configuration and/or a userspace daemon to adjust the
limits on the fly.

Independently, the basic large file streaming read (or copy) once cases
should not trash the pagecache. Can we agree on that?

I say, let's add some code to fix the problem.  If we hear about any
regression in some workloads, we can add a tunable to limit or disable
its effects, _if_ a better compromise cannot be found.

Surely it's possible to have an acceptable solution.

Best regards,

- Eric


Re: [RFC] fs/super.c: Why alloc_super use a static variable default_op?

2007-07-24 Thread rae l

On 7/25/07, Al Viro <[EMAIL PROTECTED]> wrote:

On Wed, Jul 25, 2007 at 11:48:35AM +0800, rae l wrote:
> Why does alloc_super use a static variable default_op?
> The static struct super_operations default_op is all zeros, and is
> only referenced as the initial value of a newly allocated super_block;
> what is it for?

So that we would not have to care about ->s_op *ever* being NULL.

But is it valuable, compared to the waste of sizeof(struct super_block)
bytes of memory?

When some code wants to refer to fs_type->s_op, it almost always wants to
refer to some function pointer in s_op with fs_type->s_op->***, but the
pointers in default_op are all NULL; what about this scenario?

And if you grep for s_op in the source code, you will find that nothing
tests s_op or depends on s_op not being NULL.

So my opinion is to remove default_op and just keep a newly allocated s_op NULL.

--
Denis Cheng
Linux Application Developer

"One of my most productive days was throwing away 1000 lines of code."
- Ken Thompson.

Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-24 Thread Yinghai Lu

On 7/24/07, Bjorn Helgaas <[EMAIL PROTECTED]> wrote:

On Tuesday 24 July 2007 02:33:05 pm Yinghai Lu wrote:
> I have a system that has the same problem, and it turns out that the FW
> is missing PNP0501 in the DSDT for the uart; adding it to the DSDT works well.

Is this FW that has been shipped?  Can you give any more details,
like DMI info and a copy of the DSDT?  We can't expect users to
upgrade their firmware or use a custom DSDT.

The system is not shipped yet.
Normally PNP0501 comes with the superio section in the DSDT, so I think
that in later BIOSes with ACPI it should already be there.
The problem is that some new designs may get rid of the superio while the
SB still has an extra uart for the serial port; in that case the BIOS may
not have PNP0501...

I don't think a revert is reasonable.

Or we can make legacy_serial.force=1 the default at this point.

YH

Re: [RFC] fs/super.c: Why alloc_super use a static variable default_op?

2007-07-24 Thread Al Viro

On Wed, Jul 25, 2007 at 11:48:35AM +0800, rae l wrote:
> Why does alloc_super use a static variable default_op?
> The static struct super_operations default_op is all zeros, and is
> only referenced as the initial value of a newly allocated super_block;
> what is it for?

So that we would not have to care about ->s_op *ever* being NULL.

Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered unnecessarily

2007-07-24 Thread Nick Piggin


Benjamin Herrenschmidt wrote:

On Tue, 2007-07-24 at 17:55 -0400, Trond Myklebust wrote:


If you want to use bitops as spinlocks you should rather be using
<linux/bit_spinlock.h>. That also does the right thing w.r.t.
pre-emption and sparse locking annotations.



Heh, I didn't know about those... A bit annoying that I can't override
them in the arch, I might be able to save a barrier or two here. Our


I guess the test_and_set_bit_lock / clear_bit_unlock will allow you to
override them in a way.

The big performance problem I see on my powerpc system is not the bit
spinlocks (open-coded or not), but the bit sleep locks.

Anyway, I'll finally send out the lock bitops patches again today...

--
SUSE Labs, Novell Inc.

Re: -mm merge plans for 2.6.23

2007-07-24 Thread Nick Piggin


Ray Lee wrote:

On 7/23/07, Nick Piggin <[EMAIL PROTECTED]> wrote:



Also a random day at the desktop, it is quite a broad scope and
pretty well impossible to analyse.



It is pretty broad, but that's also what swap prefetch is targeting.
As for hard to analyze, I'm not sure I agree. One can black-box test
this stuff with only a few controls. e.g., if I use the same apps each
day (mercurial, firefox, xorg, gcc), and the total I/O wait time
consistently goes down on a swap prefetch kernel (normalized by some
control statistic, such as application CPU time or total I/O, or
something), then that's a useful measurement.


I'm not saying that we can't try to tackle that problem, but first of
all you have a really nice narrow problem where updatedb seems to be
causing the kernel to completely do the wrong thing. So we start on
that.



If we can first try looking at
some specific problems that are easily identified.



Always easier, true. Let's start with "My mouse jerks around under
memory load." A Google Summer of Code student working on X.Org claims
that mlocking the mouse handling routines gives a smooth cursor under
load ([1]). It's surprising that the kernel would swap that out in the
first place.

[1] 
http://vignatti.wordpress.com/2007/07/06/xorg-input-thread-summary-or-something/ 


OK, I'm not sure what the point is though. Under heavy memory load,
things are going to get swapped out... and swap prefetch isn't going
to help there (at least, not during the memory load).

There are also other issues like whether the CPU scheduler is at fault,
etc. Interactive workloads are always the hardest to work out. updatedb
is a walk in the park by comparison.



Looking at your past email, you have a 1GB desktop system and your
overnight updatedb run is causing stuff to get swapped out such that
swap prefetch makes it significantly better. This is really
intriguing to me, and I would hope we can start by making this
particular workload "not suck" without swap prefetch (and hopefully
make it even better than it currently is with swap prefetch because
we'll try not to evict useful file backed pages as well).



updatedb is an annoying case, because one would hope that there would
be a better way to deal with that highly specific workload. It's also
pretty stat dominant, which puts it roughly in the same category as a
git diff. (They differ in that updatedb does a lot of open()s and
getdents on directories, git merely does a ton of lstat()s instead.)


Yeah, and I suspect we might be able to do better use-once of
inode and dentry caches. It isn't really highly specific: lots
of things tend to just scan over a few files once -- updatedb
just scans a lot so the problem becomes more noticeable.



Anyway, my point is that I worry that tuning for an unusual and
infrequent workload (which updatedb certainly is), is the wrong way to
go.


Well it runs every day or so for every desktop Linux user, and
it has similarities with other workloads. We don't want to optimise
it at the expense of other things, but it _really_ should not be
pushing a 1-2GB desktop into swap, I don't think.



After that we can look at other problems that swap prefetch helps
with, or think of some ways to measure your "whole day" scenario.

So when/if you have time, I can cook up a list of things to monitor
and possibly a patch to add some instrumentation over this updatedb
run.



That would be appreciated. Don't spend huge amounts of time on it,
okay? Point me the right direction, and we'll see how far I can run
with it.


I guess /proc/meminfo, /proc/zoneinfo, /proc/vmstat, /proc/slabinfo
before and after the updatedb run with the latest kernel would be a
first step. top and vmstat output during the run wouldn't hurt either.
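
If it helps, the before/after capture could be done with a small helper along these lines. This is only a sketch: the function names, the output naming scheme and the directory argument are all made up here, and a couple of cp commands in a shell script would do the same job.

```c
#include <assert.h>
#include <stdio.h>

/* Copy one (pseudo-)file to dst; returns bytes copied, or -1 on error. */
long snapshot_file(const char *src, const char *dst)
{
	assert(src != NULL && dst != NULL);
	FILE *in = fopen(src, "r");
	if (!in)
		return -1;
	FILE *out = fopen(dst, "w");
	if (!out) {
		fclose(in);
		return -1;
	}

	char buf[4096];
	size_t n;
	long total = 0;
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			total = -1;
			break;
		}
		total += (long)n;
	}
	fclose(in);
	if (fclose(out) != 0)
		total = -1;
	return total;
}

/*
 * Save the four /proc files named above as <dir>/<name>.<tag>, where tag
 * is e.g. "before" or "after". Returns how many of them could not be saved.
 */
int snapshot_vm_stats(const char *dir, const char *tag)
{
	static const char *const names[] = {
		"meminfo", "zoneinfo", "vmstat", "slabinfo"
	};
	int failed = 0;

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		char src[64], dst[512];
		snprintf(src, sizeof(src), "/proc/%s", names[i]);
		snprintf(dst, sizeof(dst), "%s/%s.%s", dir, names[i], tag);
		if (snapshot_file(src, dst) < 0)
			failed++;
	}
	return failed;
}
```

Calling snapshot_vm_stats() once with "before" and once with "after" around the updatedb run leaves two sets of files that can simply be diffed. Note that /proc/slabinfo may only be readable by root on some systems.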

Thanks,
Nick

--
SUSE Labs, Novell Inc.

Re: [PATCH 1/3] readahead: drop behind

2007-07-24 Thread Eric St-Laurent

On Sat, 2007-21-07 at 23:00 +0200, Peter Zijlstra wrote:

> Use the read-ahead code to provide hints to page reclaim.
> 
> This patch has the potential to solve the streaming-IO trashes my
> desktop problem.
> 
> It tries to aggressively reclaim pages that were loaded in a strong
> sequential pattern and have been consumed. Thereby limiting the damage
> to the current resident set.
> 
> Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>

(sorry for the delay)

Ok, I've done some tests with your patches,

I came up with a test program that should approximate my use case. It
simply mmap()s and scans (reads) a 375M file, which represents the memory
typically in use on my desktop system.  This data is frequently used, and
should stay cached as much as possible in preference to the "used once" data
read into the page cache when copying large files. I don't claim that the
test program is perfect or even correct; I'm open to suggestions.
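
For readers without the attachment, the core of such a program might look roughly like the following. This is a sketch written for illustration, not the attached large_app_load_simul source; the function name and the choice to read one byte per page are assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map `path` read-only and read one byte from every page, which forces
 * each page to be faulted in (from the page cache if it is still there,
 * from disk otherwise). Returns the number of pages touched, or -1 on
 * error; *sum receives a checksum so the reads are not optimised away.
 */
long scan_mapped_file(const char *path, unsigned long *sum)
{
	assert(path != NULL && sum != NULL);

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	struct stat st;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}

	size_t len = (size_t)st.st_size;
	const unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);			/* the mapping stays valid after close */
	if (p == MAP_FAILED)
		return -1;

	long page = sysconf(_SC_PAGESIZE);
	long touched = 0;
	unsigned long acc = 0;
	for (size_t off = 0; off < len; off += (size_t)page) {
		acc += p[off];
		touched++;
	}

	munmap((void *)p, len);
	*sum = acc;
	return touched;
}
```

Timing a call to this function with time(1), as in the runs below, then shows whether the file's pages were still cached.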

Test system:

- Linux x86_64 2.6.23-rc1
- 1G of RAM
- I use the basic drop behind and sysctl patches. The readahead size
patch is _not_ included.


Setting up:

dd if=/dev/zero of=/tmp/375M_file bs=1M count=375
dd if=/dev/zero of=/tmp/5G_file bs=1M count=5120

Tests with stock kernel (drop behind disabled):

echo 0 >/proc/sys/vm/drop_behind

Base test:

sync; echo 1 >/proc/sys/vm/drop_caches
time ./large_app_load_simul /tmp/375M_file
time ./large_app_load_simul /tmp/375M_file
time ./large_app_load_simul /tmp/375M_file
time ./large_app_load_simul /tmp/375M_file

1st execution: 0m7.146s
2nd execution: 0m1.119s
3rd execution: 0m1.109s
4th execution: 0m1.105s

Reading a large file test:

sync; echo 1 >/proc/sys/vm/drop_caches
time ./large_app_load_simul /tmp/375M_file
time ./large_app_load_simul /tmp/375M_file
cp /tmp/5G_file /dev/null
time ./large_app_load_simul /tmp/375M_file
time ./large_app_load_simul /tmp/375M_file

1st execution: 0m7.224s
2nd execution: 0m1.114s
3rd execution: 0m7.178s <<< Much slower
4th execution: 0m1.115s

Copying (read+write) a large file test:

sync; echo 1 >/proc/sys/vm/drop_caches
time ./large_app_load_simul /tmp/375M_file
time ./large_app_load_simul /tmp/375M_file
cp /tmp/5G_file /tmp/copy_of_5G_file
time ./large_app_load_simul /tmp/375M_file
time ./large_app_load_simul /tmp/375M_file
rm /tmp/copy_of_5G_file

1st execution: 0m7.203s
2nd execution: 0m1.147s
3rd execution: 0m7.238s <<< Much slower
4th execution: 0m1.129s

Tests with drop behind enabled:

echo 1 >/proc/sys/vm/drop_behind

Base test:

[same tests as above]

1st execution: 0m7.206s
2nd execution: 0m1.110s
3rd execution: 0m1.102s
4th execution: 0m1.106s

Reading a large file test:

[same tests as above]

1st execution: 0m7.197s
2nd execution: 0m1.116s
3rd execution: 0m1.114s <<< Great!!!
4th execution: 0m1.111s

Copying (read+write) a large file test:

[same tests as above]

1st execution: 0m7.186s
2nd execution: 0m1.111s
3rd execution: 0m7.339s <<< Not fixed
4th execution: 0m1.121s


Conclusion:

- The drop-behind patch works and really prevents the page cache content
from being filled with useless read-once data.

- It doesn't help the copy (read+write) case. This should also be fixed,
as it's a common workload.

Tested-By: Eric St-Laurent ([EMAIL PROTECTED])



Best regards,

- Eric

(*) Test program and batch file are attached.

diff -urN linux-2.6/include/linux/swap.h linux-2.6-drop-behind/include/linux/swap.h
--- linux-2.6/include/linux/swap.h	2007-07-21 18:26:00.0 -0400
+++ linux-2.6-drop-behind/include/linux/swap.h	2007-07-22 16:22:48.0 -0400
@@ -180,6 +180,7 @@
 /* linux/mm/swap.c */
 extern void FASTCALL(lru_cache_add(struct page *));
 extern void FASTCALL(lru_cache_add_active(struct page *));
+extern void FASTCALL(lru_demote(struct page *));
 extern void FASTCALL(activate_page(struct page *));
 extern void FASTCALL(mark_page_accessed(struct page *));
 extern void lru_add_drain(void);
diff -urN linux-2.6/kernel/sysctl.c linux-2.6-drop-behind/kernel/sysctl.c
--- linux-2.6/kernel/sysctl.c	2007-07-21 18:26:01.0 -0400
+++ linux-2.6-drop-behind/kernel/sysctl.c	2007-07-22 16:20:27.0 -0400
@@ -163,6 +163,7 @@
 
 extern int prove_locking;
 extern int lock_stat;
+extern int sysctl_dropbehind;
 
 /* The default sysctl tables: */
 
@@ -1048,6 +1049,14 @@
 		.extra1		= ,
 	},
 #endif
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "drop_behind",
+		.data		= &sysctl_dropbehind,
+		.maxlen		= sizeof(sysctl_dropbehind),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
 /*
  * NOTE: do not add new entries to this table unless you have read
  * Documentation/sysctl/ctl_unnumbered.txt
diff -urN linux-2.6/mm/readahead.c linux-2.6-drop-behind/mm/readahead.c
--- linux-2.6/mm/readahead.c	2007-07-21 18:26:01.0 -0400
+++ linux-2.6-drop-behind/mm/readahead.c	2007-07-22 16:41:47.0 -0400
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 void default_unplug_io_fn(struct backing_dev_info *bdi, struct page *page)
 {
@@ -429,6 +430,8 @@
 }

[RFC] fs/super.c: Why alloc_super use a static variable default_op?

2007-07-24 Thread rae l


Why does alloc_super use a static variable default_op?
The static struct super_operations default_op is all zeros, and is
only referenced as the initial value of a newly allocated super_block;
what is it for?

The filesystem-dependent code such as ext2_fill_super fills this
field eventually,
and after careful checking, it seems that no filesystem needs an
all-zero default_op,

as the output of this command in the kernel source tree shows:
$ grep -RInw s_op fs/
You can check all the uses of s_op.

/**
 *	alloc_super	-	create new superblock
 *	@type:	filesystem type superblock should belong to
 *
 *	Allocates and initializes a new &struct super_block.  alloc_super()
 *	returns a pointer new superblock or %NULL if allocation had failed.
 */
static struct super_block *alloc_super(struct file_system_type *type)
{
	struct super_block *s = kzalloc(sizeof(struct super_block),  GFP_USER);
	static struct super_operations default_op;

	if (s) {
		...
		s->s_op = &default_op;
		s->s_time_gran = 1000000000;
	}
out:
	return s;
}


--
Denis Cheng
Linux Application Developer

"One of my most productive days was throwing away 1000 lines of code."
- Ken Thompson.

Re: 2.6.23-rc1 regression: mm: fix fault vs invalidate race for linear mappings

2007-07-24 Thread Nick Piggin


Bret Towe wrote:

for a while in -git I've had an issue that on boot when gdm loads the
screen stays black
using ctrl-f1 doesn't return to a console and killing X doesn't help any
ssh'ing into the box does work; top only shows 100% io-wait
dmesg shows nothing odd

the workaround I have at the moment is to just move the radeon drm module
out of the way so it doesn't load on boot, and X works just fine like that
I did some bisecting which took a few days and tracked it down to
commit d00806b183152af6d24f46f0c33f14162ca1262a

it's way too complex for me to revert it on top of -rc1 to verify that's
the issue tho
I keep forgetting to get a trace of what it's waiting on when I'm in that
kernel
I assume that would be of use and I'll get that later

the box this is happening on is a g4 mac mini; the built-in card is a
radeon 9200
I'm not seeing any issues on an amd64 box with a radeon card; it's a
9600 tho


Is this with a binary-only module? We saw an issue with that in SLES9
where the module is returning a locked page from its nopage handler
when it isn't really supposed to. It might be fixed in latest drivers,
have you tried them?

Thanks,
Nick

--
SUSE Labs, Novell Inc.

Re: RFC: CONFIG_PAGE_SHIFT (aka software PAGE_SIZE)

2007-07-24 Thread William Lee Irwin III

On Wed, Jul 18, 2007 at 06:32:22AM -0700, William Lee Irwin III wrote:
>> Actually I'd worked on what was called MPSS (Multiple Page Size Support)
>> before I ever started on pgcl. Some large portion of the pgcl proposal
>> as I presented it internally was to reduce the order of large page
>> allocations and provide a promotion and demotion mechanism enabling
>> different processes to have different sized translations for the same
>> large page, and hence no out-of-context pagetable/TLB updates during
>> promotion and demotion, essentially by making the TLB translation to
>> page relation M:N. ISTR describing this in a KS presentation for which
>> IIRC you were present. But that's neither here nor there.

On Tue, Jul 24, 2007 at 09:44:18PM +0200, Andrea Arcangeli wrote:
> Well the whole difference between you back then and SGI now, is that
> your stuff wasn't being pushed to be merged very hard (it was proposed
> but IIRC more as research topic, like the large PAGE_SIZE also fallen
> into that same research area). See now the emails from SGI fs folks
> about variable order page size, they want it merged badly instead.

Neither were research topics, but I'm tired of correcting the history
of my failures. I've got enough ongoing failures as things stand.

On Tue, Jul 24, 2007 at 09:44:18PM +0200, Andrea Arcangeli wrote:
> My whole point is that the single moment the variable order page size
> isn't pure research anymore like MPSS, the CONFIG_PAGE_SHIFT isn't
> research anymore either, like the tail packing in pagecache with
> kmalloc also isn't research anymore.

There was never any research involved in the page clustering per se.
It was supposed to be a generally advantageous thing that Linus had
at least once explicitly approved of that just so happened to relieve
mem_map[] pressure on 64GB i386, the side effect intended to attract
corporate patronage.

That last fact was not only demonstrable, it was used in the first
ever public demonstration of a 64GB i386 machine running Linux, which
I personally carried out.

Beyond active hindrances and lacks of cooperation, a "competing
solution" with distro backing appeared that removed the last vestige
of corporate patronage from the project. It ended up bitrotting
faster than I could singlehandedly do all the maintenance, testing,
and coding work on it while also trying to get anything else done.

MPSS was not as well-developed at the time the hugetlb "solution"
killed it, but is not terribly dissimilar in how it came into
being, developed, and then died, apart from less active hindrance.

The one and only aspect in which any research was involved was a
proposal, never accepted or pursued, to investigate how larger
base page sizes implemented via page clustering mitigated external
fragmentation for the purposes of MPSS and also how certain
techniques borrowed from page clustering could reduce the frequency
of and performance penalties associated with demotion in MPSS. The
proposal has never been publicly circulated, though some of its content
was described in the KS presentation as "future directions" or similar.

On Tue, Jul 24, 2007 at 09:44:18PM +0200, Andrea Arcangeli wrote:
> About the fs deciding the size of the pagecache granularity I totally
> dislike that design, there's no reason why the fs should control that,
[...]

This is all valid commentary, though I don't have any particular
response to it.

In any event, I've never been involved in a research project, though
I would've liked to have been. The emphasis in all cases was enabling
specific functionality in production, using techniques whose viability
had furthermore already been demonstrated elsewhere, by others.

In both instances, insurmountable nontechnical obstacles were present,
which remain in place and effectively limit the scale and scope of any
sort of project I can personally lead with any sort of likelihood of
mainline acceptance.

Where I am limited, you are not. Good luck to you.

-- wli

Re: [PATCH][RFC] getting rid of stupid loop in BUG()

2007-07-24 Thread Keith Owens

Trent Piepho (on Tue, 24 Jul 2007 19:31:36 -0700 (PDT)) wrote:
>Adding __builtin_trap after the
>asm might be an ok fix.  It will emit a spurious int 6, but that won't even be
>reached since the asm doesn't return, and it would probably be less extra code than
>the loop.

int 6 is a two byte instruction, the loop generates jmp with an 8 bit
offset, also two bytes.  No change in code size.

Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-24 Thread Bjorn Helgaas

On Tuesday 24 July 2007 02:33:05 pm Yinghai Lu wrote:
> I have a system that has the same problem, and it turns out that the FW
> is missing PNP0501 in the DSDT for the UART. Adding it to the DSDT works well.

Is this FW that has been shipped?  Can you give any more details,
like DMI info and a copy of the DSDT?  We can't expect users to
upgrade their firmware or use a custom DSDT.

Bjorn

Re: [PATCH RFC] extent mapped page cache

2007-07-24 Thread Nick Piggin

On Tue, Jul 24, 2007 at 07:25:09PM -0400, Chris Mason wrote:
> On Tue, 24 Jul 2007 23:25:43 +0200
> Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> 
> The tree is a critical part of the patch, but it is also the easiest to
> rip out and replace.  Basically the code stores a range by inserting
> an object at an index corresponding to the end of the range.
> 
> Then it does searches by looking forward from the start of the range.
> More or less any tree that can search and return the first key >=
> than the requested key will work.
> 
> So, I'd be happy to rip out the tree and replace with something else.
> Going completely lockless will be tricky; it's something that will need deep
> thought once the rest of the interface is sane.

Just having the other tree and managing it is what makes me a little
less positive of this approach, especially using it to store pagecache
state when we already have the pagecache tree.

Having another tree to store block state I think is a good idea as I
said in the fsblock thread with Dave, but I haven't clicked as to why
it is a big advantage to use it to manage pagecache state. (and I can
see some possible disadvantages in locking and tree manipulation overhead).
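
The range-keyed lookup Chris describes can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from the patch: a sorted array stands in for the tree, and `struct extent`, `first_end_ge()` and `extent_lookup()` are hypothetical names. Each range is keyed by its end, and a lookup asks for the first key >= the start of the requested range.

```c
#include <stddef.h>

/* Hypothetical extent record: the end offset doubles as the lookup key. */
struct extent {
	unsigned long start;	/* first offset covered */
	unsigned long end;	/* last offset covered (the key) */
};

/*
 * Return the first extent whose end key is >= key, or NULL if none.
 * The array must be sorted by end and hold non-overlapping extents.
 */
static const struct extent *first_end_ge(const struct extent *v, size_t n,
					 unsigned long key)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (v[mid].end < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo < n ? &v[lo] : NULL;
}

/* Find an extent overlapping [start, end], or NULL if there is none. */
const struct extent *extent_lookup(const struct extent *v, size_t n,
				   unsigned long start, unsigned long end)
{
	const struct extent *e = first_end_ge(v, n, start);

	/* The candidate ends at or after start; it overlaps if it begins by end. */
	if (e && e->start <= end)
		return e;
	return NULL;
}
```

Any ordered structure that answers the same "first key >= k" query could replace the array, which is the property that makes the tree easy to rip out and replace.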

Re: [PATCH][RFC] getting rid of stupid loop in BUG()

2007-07-24 Thread Trent Piepho

On Tue, 24 Jul 2007, Al Viro wrote:
>   AFAICS, the patch below should do it for i386; instead of
> using a dummy loop to tell gcc that this sucker never returns,
> we do
> static void __always_inline __noreturn __BUG(const char *file, int line);
> containing the actual asm we want to insert and define BUG() as
> __BUG(__FILE__, __LINE__).  It looks safe, but I don't claim enough
> experience with gcc __asm__ potential nastiness, so...

Sounds like it doesn't work:
http://gcc.gnu.org/ml/gcc/2007-02/msg00107.html

[The] programmer won't get optimization he wants as after inlining this
attribute information becomes completely lost.

What about __builtin_trap?

It results in int 6 that might not be applicable, but adding some control
over it to i386 backend is definitely an option.

Honza

It seems like if __BUG() is not inlined, you get the bogus noreturn does
return warning.  If it is inlined, then you lose the noreturn attribute and
un-reachable code paths aren't eliminated.  Adding __builtin_trap after the
asm might be an ok fix.  It will emit a spurious int 6, but that won't even be
reached since the asm doesn't return, and it would probably be less extra code than
the loop.
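
A minimal sketch of that suggestion, assuming gcc or clang. `my_bug()` and `table_entry()` are hypothetical names, and the empty asm is only a placeholder for the real trapping instruction and its file/line bookkeeping; this is not the kernel's BUG().

```c
/*
 * Hypothetical stand-in for __BUG(): the asm is a placeholder for the
 * real trapping instruction, and the __builtin_trap() after it tells
 * the compiler that control never continues past this point, even
 * once the function has been inlined.
 */
static inline __attribute__((always_inline, noreturn)) void my_bug(void)
{
	__asm__ __volatile__("" : : : "memory");
	__builtin_trap();
}

/* No return statement is needed after my_bug(): that path is dead code. */
int table_entry(int idx)
{
	switch (idx) {
	case 0:
		return 10;
	case 1:
		return 20;
	}
	my_bug();
}
```

Calling `table_entry()` with any other index ends the process at the trap, so only the valid indices are safe to exercise.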

Re: [PATCH] hwmon: Add missing __devexit tags in various drivers

2007-07-24 Thread Mark M. Hoffman

Hi Jean:
* Jean Delvare <[EMAIL PROTECTED]> [2007-07-22 12:09:48 +0200]:
> On Sun, 22 Jul 2007 00:30:56 +0200, Gabriel C wrote:
> > I noticed these warnings on current git:
> > 
> > drivers/hwmon/pc87360.c:1082: warning: 'pc87360_remove' defined but not used
> > drivers/hwmon/sis5595.c:580: warning: 'sis5595_remove' defined but not used
> > drivers/hwmon/smsc47m1.c:608: warning: 'smsc47m1_remove' defined but not used
> > drivers/hwmon/via686a.c:648: warning: 'via686a_remove' defined but not used
> > drivers/hwmon/vt8231.c:755: warning: 'vt8231_remove' defined but not used
> 
> Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
> ---
>  drivers/hwmon/it87.c |2 +-
>  drivers/hwmon/pc87360.c  |2 +-
>  drivers/hwmon/sis5595.c  |2 +-
>  drivers/hwmon/smsc47m1.c |2 +-
>  drivers/hwmon/via686a.c  |2 +-
>  drivers/hwmon/vt8231.c   |4 ++--
>  drivers/hwmon/w83627hf.c |2 +-
>  7 files changed, 8 insertions(+), 8 deletions(-)
> 
> --- linux-2.6.23-pre.orig/drivers/hwmon/it87.c 2007-07-22 11:51:47.0 +0200
> +++ linux-2.6.23-pre/drivers/hwmon/it87.c 2007-07-22 11:56:48.0 +0200
> @@ -252,7 +252,7 @@ struct it87_data {
>  
>  
>  static int it87_probe(struct platform_device *pdev);
> -static int it87_remove(struct platform_device *pdev);
> +static int __devexit it87_remove(struct platform_device *pdev);
>  
>  static int it87_read_value(struct it87_data *data, u8 reg);
>  static void it87_write_value(struct it87_data *data, u8 reg, u8 value);
> --- linux-2.6.23-pre.orig/drivers/hwmon/pc87360.c 2007-07-22 09:54:08.0 +0200
> +++ linux-2.6.23-pre/drivers/hwmon/pc87360.c 2007-07-22 11:56:48.0 +0200
> @@ -220,7 +220,7 @@ struct pc87360_data {
>   */
>  
>  static int pc87360_probe(struct platform_device *pdev);
> -static int pc87360_remove(struct platform_device *pdev);
> +static int __devexit pc87360_remove(struct platform_device *pdev);
>  
>  static int pc87360_read_value(struct pc87360_data *data, u8 ldi, u8 bank,
> u8 reg);
> --- linux-2.6.23-pre.orig/drivers/hwmon/sis5595.c 2007-07-22 09:54:08.0 +0200
> +++ linux-2.6.23-pre/drivers/hwmon/sis5595.c 2007-07-22 11:56:48.0 +0200
> @@ -187,7 +187,7 @@ struct sis5595_data {
>  static struct pci_dev *s_bridge; /* pointer to the (only) sis5595 */
>  
>  static int sis5595_probe(struct platform_device *pdev);
> -static int sis5595_remove(struct platform_device *pdev);
> +static int __devexit sis5595_remove(struct platform_device *pdev);
>  
>  static int sis5595_read_value(struct sis5595_data *data, u8 reg);
>  static void sis5595_write_value(struct sis5595_data *data, u8 reg, u8 value);
> --- linux-2.6.23-pre.orig/drivers/hwmon/smsc47m1.c 2007-07-22 09:54:08.0 +0200
> +++ linux-2.6.23-pre/drivers/hwmon/smsc47m1.c 2007-07-22 11:56:48.0 +0200
> @@ -134,7 +134,7 @@ struct smsc47m1_sio_data {
>  
>  
>  static int smsc47m1_probe(struct platform_device *pdev);
> -static int smsc47m1_remove(struct platform_device *pdev);
> +static int __devexit smsc47m1_remove(struct platform_device *pdev);
>  static struct smsc47m1_data *smsc47m1_update_device(struct device *dev,
>   int init);
>  
> --- linux-2.6.23-pre.orig/drivers/hwmon/via686a.c 2007-07-22 09:54:08.0 +0200
> +++ linux-2.6.23-pre/drivers/hwmon/via686a.c 2007-07-22 11:56:48.0 +0200
> @@ -314,7 +314,7 @@ struct via686a_data {
>  static struct pci_dev *s_bridge; /* pointer to the (only) via686a */
>  
>  static int via686a_probe(struct platform_device *pdev);
> -static int via686a_remove(struct platform_device *pdev);
> +static int __devexit via686a_remove(struct platform_device *pdev);
>  
>  static inline int via686a_read_value(struct via686a_data *data, u8 reg)
>  {
> --- linux-2.6.23-pre.orig/drivers/hwmon/vt8231.c 2007-07-22 09:54:08.0 +0200
> +++ linux-2.6.23-pre/drivers/hwmon/vt8231.c 2007-07-22 11:56:48.0 +0200
> @@ -167,7 +167,7 @@ struct vt8231_data {
>  
>  static struct pci_dev *s_bridge;
>  static int vt8231_probe(struct platform_device *pdev);
> -static int vt8231_remove(struct platform_device *pdev);
> +static int __devexit vt8231_remove(struct platform_device *pdev);
>  static struct vt8231_data *vt8231_update_device(struct device *dev);
>  static void vt8231_init_device(struct vt8231_data *data);
>  
> @@ -751,7 +751,7 @@ exit_release:
>   return err;
>  }
>  
> -static int vt8231_remove(struct platform_device *pdev)
> +static int __devexit vt8231_remove(struct platform_device *pdev)
>  {
>   struct vt8231_data *data = platform_get_drvdata(pdev);
>   int i;
> --- linux-2.6.23-pre.orig/drivers/hwmon/w83627hf.c 2007-07-22 11:51:49.0 +0200
> +++ linux-2.6.23-pre/drivers/hwmon/w83627hf.c 2007-07-22 11:56:48.0 +0200
> @@ -384,7 +384,7 @@ struct w83627hf_sio_data {
>  
>  
>  static int w83627hf_probe(struct platform_device *pdev);
>

Re: [PATCH][07/37] Clean up duplicate includes in drivers/hwmon/

2007-07-24 Thread Mark M. Hoffman

Hi Jesper:

* Jesper Juhl <[EMAIL PROTECTED]> [2007-07-21 17:02:01 +0200]:
> Hi,
> 
> This patch cleans up duplicate includes in
>   drivers/hwmon/
> 
> 
> Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>
> ---
> 
> diff --git a/drivers/hwmon/ams/ams-core.c b/drivers/hwmon/ams/ams-core.c
> index 6db9737..a112a03 100644
> --- a/drivers/hwmon/ams/ams-core.c
> +++ b/drivers/hwmon/ams/ams-core.c
> @@ -23,7 +23,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  

Applied to hwmon-2.6.git/testing, thanks.

-- 
Mark M. Hoffman
[EMAIL PROTECTED]

Re: v2.6.22.1-rt5

2007-07-24 Thread Gene Heskett

On Tuesday 24 July 2007, Ingo Molnar wrote:
>* Gene Heskett <[EMAIL PROTECTED]> wrote:
>> The above stanza still needs some tlc.  I built a 2.6.22.1-rt6 (rt5
>> wouldn't build) using the same old config that a make oldconfig didn't
>> fuss about, but the reboot never completed, see the attached, heavily
>> smunched camera shot of the panic.
>>
>> Kinda looks like hda/sda confusion, with rt3 (this boot), it's hda*,
>> what is it now?  fstab or kernel config error?
>
>yeah, as long as your filesystems are created with a proper label, all
>that you need to do is to change all 'hda' to 'sda' in the new kernel's
>/etc/grub.conf entry. (or enable the old IDE code in the .config, under
>CONFIG_IDE)
>
>   Ingo

I believe it is on:

[EMAIL PROTECTED] linux-2.6.22.1-rt6]# grep CONFIG_IDE .config
CONFIG_IDE=y
# CONFIG_IDEDISK_MULTI_MODE is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_PROC_FS=y
CONFIG_IDE_GENERIC=y
# CONFIG_IDEPCI_SHARE_IRQ is not set
CONFIG_IDEPCI_PCIBUS_ORDER=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_IDE_ARM is not set
# CONFIG_IDE_CHIPSETS is not set
# CONFIG_IDEDMA_IVB is not set


-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Computer programmers never die, they just get lost in the processing.

Re: [rtc-linux] Re: rtc-ds1307.c: array overrun

2007-07-24 Thread Alessandro Zummo

On Sun, 22 Jul 2007 18:17:17 -0700
David Brownell <[EMAIL PROTECTED]> wrote:

> 
> On Sunday 22 July 2007, Adrian Bunk wrote:
> > The Coverity checker spotted the following array overrun
> > in drivers/rtc/rtc-ds1307.c:
> 
> Typo -- thanks, fix is attached.
> 
>   CUT HERE
> Fix a typo turned up by a Coverity check:  referring to the wrong register,
> which could cause problems restarting DS1338 RTCs after their oscillator
> halted.  (For example, if the backup battery died.)
> 
> Signed-off-by: David Brownell <[EMAIL PROTECTED]>

 
 Acked-by: Alessandro Zummo <[EMAIL PROTECTED]>

-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Torino, Italy

  http://www.towertech.it

Re: [rtc-linux] [PATCH] s3c2410: fixup after arch moves

2007-07-24 Thread Alessandro Zummo

On Tue, 24 Jul 2007 13:40:04 +0100
Ben Dooks <[EMAIL PROTECTED]> wrote:

> 
> Fixup the changes from moving around the arch
> support for s3c24xx based systems.
> 
> Signed-off-by: Ben Dooks <[EMAIL PROTECTED]>


 Acked-by: Alessandro Zummo <[EMAIL PROTECTED]>


-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Torino, Italy

  http://www.towertech.it

Re: [PATCH] Fix arch/i386/kernel/nmi.c - 'unknown_nmi_panic_callback' declared 'static' but never defined warning

2007-07-24 Thread Andrew Morton

On Sun, 22 Jul 2007 21:20:38 +0200 Gabriel C <[EMAIL PROTECTED]> wrote:

> I get this warning when CONFIG_SYSCTL is not set :
> 
> ...
> 
> arch/i386/kernel/nmi.c:52: warning: 'unknown_nmi_panic_callback' declared 'static' but never defined
> 
> ...
> 
> Signed-off-by: Gabriel Craciunescu <[EMAIL PROTECTED]>
> 
> ---
> 
> diff --git a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
> index 03b7f55..cf11121 100644
> --- a/arch/i386/kernel/nmi.c
> +++ b/arch/i386/kernel/nmi.c
> @@ -49,8 +49,9 @@ static unsigned int nmi_hz = HZ;
>  static DEFINE_PER_CPU(short, wd_enabled);
>  
>  /* local prototypes */
> +#ifdef CONFIG_SYSCTL
>  static int unknown_nmi_panic_callback(struct pt_regs *regs, int cpu);
> -
> +#endif
>  static int endflag __initdata = 0;

guys, please take a closer look at the code which you're changing?  We can
obviously move do_nmi_callback() down to above
__trigger_all_cpu_backtrace() and then do away with this declaration
altogether.
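
The reordering suggested here can be sketched as follows. All names are hypothetical (`DEMO_SYSCTL` stands in for CONFIG_SYSCTL, and the `demo_*` functions are not the real nmi.c code); the point is only that a static helper defined above its sole caller needs no forward declaration, so there is nothing left to guard with #ifdef.

```c
#define DEMO_SYSCTL 1	/* hypothetical stand-in for CONFIG_SYSCTL */

#ifdef DEMO_SYSCTL
static int demo_panic_flag;

/* Defined before its only caller, so no separate prototype is needed. */
static int demo_unknown_callback(int cpu)
{
	(void)cpu;	/* unused in this sketch */
	return 1;
}
#endif

/* The caller sits below the helper and uses it only when it exists. */
int demo_do_callback(int cpu)
{
#ifdef DEMO_SYSCTL
	if (demo_panic_flag)
		return demo_unknown_callback(cpu);
#endif
	return 0;
}
```

With the option disabled, neither the helper nor any declaration of it is compiled, so the "declared 'static' but never defined" warning cannot occur.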

Re: v2.6.22.1-rt5

2007-07-24 Thread Gene Heskett

On Tuesday 24 July 2007, Ingo Molnar wrote:
>* Gene Heskett <[EMAIL PROTECTED]> wrote:
>> The above stanza still needs some tlc.  I built a 2.6.22.1-rt6 (rt5
>> wouldn't build) using the same old config that a make oldconfig didn't
>> fuss about, but the reboot never completed, see the attached, heavily
>> smunched camera shot of the panic.
>>
>> Kinda looks like hda/sda confusion, with rt3 (this boot), it's hda*,
>> what is it now?  fstab or kernel config error?
>
>yeah, as long as your filesystems are created with a proper label, all
>that you need to do is to change all 'hda' to 'sda' in the new kernel's
>/etc/grub.conf entry. (or enable the old IDE code in the .config, under
>CONFIG_IDE)
>
>   Ingo

Changing the "root (hd0,0)" to (sd0,0) failed.  Grub can't parse the (sd0,0).

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
WHERE CAN THE MATTER BE
Oh, dear, where can the matter be
When it's converted to energy?
There is a slight loss of parity.
Johnny's so long at the fair.

Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-24 Thread David Miller

From: "Matthew Hawkins" <[EMAIL PROTECTED]>
Date: Wed, 25 Jul 2007 11:26:57 +1000

> On 7/24/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > The other consideration here is, as Nick points out, are the problems which
> > people see this patch solving for them solveable in other, better ways?
> > IOW, is this patch fixing up preexisting deficiencies post-facto?
> 
> So let me get this straight - you don't want to merge swap prefetch
> which exists now and solves issues many people are seeing, and has
> been tested more than a gazillion other bits & pieces that do get
> merged - because it could be possible that in the future some other
> patch, which doesn't yet exist and nobody is working on, may solve the
> problem better?

I have to generally agree that the objections to the swap prefetch
patches have been conjecture and in general wasting time and
frustrating people.

There is a point at which it might be wise to just step back and let
the river run its course and see what happens.  Initially, it's good
to play games of "what if", but after several months it's not a
productive thing and slows down progress for no good reason.

If a better mechanism gets implemented, great!  We can easily
replace the swap prefetch stuff at such time.  But until then swap
prefetch is what we have and it's sat long enough in -mm with no major
problems to merge it.

[PATCH] powerpc: Pegasos keyboard detection

2007-07-24 Thread Alan Curry

As of 2.6.22 the kernel doesn't recognize the i8042 keyboard/mouse controller
on the PegasosPPC. This is because of a feature/bug in the OF device tree:
the "device_type" attribute is an empty string instead of "8042" as the
kernel expects. This patch (against 2.6.22.1) adds a secondary detection
which looks for a device whose *name* is "8042" if there is no device whose
*type* is "8042".

Signed-off-by: Alan Curry <[EMAIL PROTECTED]>

--- arch/powerpc/kernel/setup-common.c.orig 2007-07-24 19:04:17.0 -0500
+++ arch/powerpc/kernel/setup-common.c  2007-07-24 19:06:36.0 -0500
@@ -487,6 +487,10 @@ int check_legacy_ioport(unsigned long ba
 	switch(base_port) {
 	case I8042_DATA_REG:
 		np = of_find_node_by_type(NULL, "8042");
+		/* Pegasos has no device_type on its 8042 node, look for the
+		 * name instead */
+		if (!np)
+			np = of_find_node_by_name(NULL, "8042");
 		break;
 	case FDC_BASE: /* FDC1 */
 		np = of_find_node_by_type(NULL, "fdc");
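
The fallback logic of this patch, reduced to a standalone sketch. It does not use the kernel's OF API: `struct demo_node` and the `demo_find_*` helpers are hypothetical simplifications that search a flat array, under the assumption that a node is just a name and a type string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device-tree node: only a name and a type. */
struct demo_node {
	const char *name;
	const char *type;
};

/* Return the first node whose type matches, or NULL. */
static const struct demo_node *demo_find_by_type(const struct demo_node *nodes,
						 size_t n, const char *type)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(nodes[i].type, type) == 0)
			return &nodes[i];
	return NULL;
}

/* Return the first node whose name matches, or NULL. */
static const struct demo_node *demo_find_by_name(const struct demo_node *nodes,
						 size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(nodes[i].name, name) == 0)
			return &nodes[i];
	return NULL;
}

/* Prefer a match on type; fall back to the name if the type is missing. */
const struct demo_node *demo_find_8042(const struct demo_node *nodes, size_t n)
{
	const struct demo_node *np = demo_find_by_type(nodes, n, "8042");

	if (!np)
		np = demo_find_by_name(nodes, n, "8042");
	return np;
}
```

Trying the type first keeps the behaviour unchanged on machines whose device tree is well-formed; the name lookup only runs when the first search comes back empty.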

Re: v2.6.22.1-rt5

2007-07-24 Thread Gene Heskett

On Tuesday 24 July 2007, Ingo Molnar wrote:
>* Gene Heskett <[EMAIL PROTECTED]> wrote:
>> The above stanza still needs some tlc.  I built a 2.6.22.1-rt6 (rt5
>> wouldn't build) using the same old config that a make oldconfig didn't
>> fuss about, but the reboot never completed, see the attached, heavily
>> smunched camera shot of the panic.
>>
>> Kinda looks like hda/sda confusion, with rt3 (this boot), it's hda*,
>> what is it now?  fstab or kernel config error?
>
>yeah, as long as your filesystems are created with a proper label, all
>that you need to do is to change all 'hda' to 'sda' in the new kernel's
>/etc/grub.conf entry. (or enable the old IDE code in the .config, under
>CONFIG_IDE)
>
>   Ingo

Damn, I didn't say it clearly enough: the / is on an LVM volume.  /boot is
on /dev/hda1, aka (hd0,0) in the first line.  Since the msg pointed at 0,0,
I'll switch that line to "root (sd0,0)" just for grins.

I take it that was an auto-conversion?  I did nothing to confirm any changes 
when I ran a make oldconfig, using the 2.6.22.1-rt3 .config as the src 
config.

Thanks.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
But I was there and I saw what you did,
I saw it with my own two eyes.
So you can wipe off that grin;
I know where you've been--
It's all been a pack of lies!

Re: [ck] Re: -mm merge plans for 2.6.23

2007-07-24 Thread Matthew Hawkins


On 7/24/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> The other consideration here is, as Nick points out, are the problems which
> people see this patch solving for them solveable in other, better ways?
> IOW, is this patch fixing up preexisting deficiencies post-facto?


So let me get this straight - you don't want to merge swap prefetch
which exists now and solves issues many people are seeing, and has
been tested more than a gazillion other bits & pieces that do get
merged - because it could be possible that in the future some other
patch, which doesn't yet exist and nobody is working on, may solve the
problem better?

You know what, just release Linux 0.02 as 2.6.23 because, using your
logic, everything that was merged since October 5, 1991 could be
replaced by something better.  Perhaps.  So there's obviously no point
having it there in the first place & there'll be untold savings in
storage costs and compilation time for the kernel tree, also bandwidth
for the mirror sites etc. in the mean time while we wait for the magic
pixies to come and deliver the one true piece of code that cannot be
improved upon.


> Well.  The above, plus there's always a lot of stuff happening in MM land,
> and I haven't seen much in the way of enthusiasm from the usual MM
> developers.


I haven't seen much in the way of enthusiasm from developers, period.
People are tired of maintaining patches for years that never get
merged into mainline because of totally bullshit reasons (usually
amounting to NIH syndrome)

--
Matt

Re: [patch 1/3] ps3: Disk Storage Driver

2007-07-24 Thread Andrew Morton

On Wed, 25 Jul 2007 11:09:21 +1000 Paul Mackerras <[EMAIL PROTECTED]> wrote:

> Also, I prefer the style where the ? and : operators have a space
> after them but not before them, rather than a space either side.

Could I point out that your likes and dislikes are immaterial?  The whole
point here is to get kernel code looking consistent.  That means that
basically everyone ends up doing things which they'd prefer not to do. 
That certainly applies to me.  

The idea is that the benefit of making things consistent exceeds the costs
of some individuals adopting styles which they are less used to.

So telling people what you do and don't like is simply irrelevant, except
for when it is used as an input in determining what the standard kernel
style is to be.  (And that is largely determined by observing what we have
now).

And sure, major subsystems can and do go off and do their own thing - ia64
for example has done a lot of that, pretty consistently.  The world hasn't
ended as a result.

Re: Time Problems with 2.6.23-rc1-gf695baf2

2007-07-24 Thread Bartlomiej Zolnierkiewicz

On Wednesday 25 July 2007, Bartlomiej Zolnierkiewicz wrote:
> 
> Hi,
> 
> On Wednesday 25 July 2007, Michal Piotrowski wrote:
> > Hi,
> > 
> > On 24/07/07, Eric Sesterhenn / Snakebyte <[EMAIL PROTECTED]> wrote:
> > > hi,
> > >
> > > seems like the clock got screwed or something similar. During bootup the
> > > computer hangs (no response on keyboard leds when pressing caps lock),
> > > the only way to make sure it resumes booting is by repeatedly pressing
> > > the power switch,
> > 
> > :)
> > 
> > > see second 13 to 510, after pressing it about ten
> > > times, it continues booting.
> > 
> > Probing IDE interface...
> > 
> > [   13.867939] VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci:00:04.1
> > [   13.868062] ide0: BM-DMA at 0xd800-0xd807, BIOS settings: hda:DMA, hdb:pio
> > [   13.868268] ide1: BM-DMA at 0xd808-0xd80f, BIOS settings: hdc:DMA, hdd:DMA
> > [   13.868574] Probing IDE interface ide0...
> > [  387.279576] Clocksource tsc unstable (delta = 370195339890 ns)
> > [  496.200082] hda: ST340823A, ATA DISK drive
> > [  510.264511] hda: selected mode 0x44
> > [  510.264826] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> > 
> > Could you please try to revert these commits
> 
> It doesn't seem like an IDE bug at all, rather seems to be some issue
> related to the recent "lack of the proper clocksource fallback" bug...

or ACPI

> > > [   13.506890] ACPI Exception (processor_throttling-0084): AE_NOT_FOUND, Evaluating _PTC [20070126]
> > > [   13.507101] ACPI Exception (processor_throttling-0147): AE_NOT_FOUND, Evaluating _TSS [20070126]

Bart

Re: [PATCH] update checkpatch.pl to version 0.08

2007-07-24 Thread Adrian Bunk

On Tue, Jul 24, 2007 at 03:32:59PM -0500, jschopp wrote:
>>> Yep I think the consensus is we need a
>>> "--i-don't-agree-just-check-things-which-will-get-me-rejected-out-of-hand"
>>> option of some sort which will restrict output to the real errors.
>> No, the default should be to show only the real errors.
>
> CodingStyle violations are real errors.
>
> If we have agreed that code should look a certain way, and there is a patch 
> that doesn't look that way, that is an error.  Maybe not a runtime error, 
> but a readability error.  A reviewability error.  A maintainability error.  
> A big waste of everybody's time.
>
> I personally don't care if code is indented with 2 spaces, 4 spaces, or a 
> tab.  What I do care about is that all the code is indented consistently so 
> we don't waste an ounce of our energy reading code/patches and thinking 
> about indentation or even worse spending our time arguing over it on 
> mailing lists when there are better things to argue about.
>
> Back when I wrote the early versions of this script I didn't write it 
> because I'm anal retentive about CodingStyle.  I wrote it for the exact 
> opposite reason.  I was tired of seeing email on mailing lists reviewing 
> patches saying there was indentation with spaces instead of tabs, or 
> trailing whitespace, or { on the wrong line.  It was a waste of the
> reviewers' time, it was a waste of the developers' time, it was a waste of
> the time of everybody on the mailing lists.  We should spend all that
> energy arguing over the merits of what the code does.

There's a relatively small number of common codingstyle mistakes
accounting for most of these mistakes.

> So let's argue over the CodingStyle once and be done with the argument 
> instead of having the argument every day on the mailing lists forever.  We 
> end up with more time to argue over much more interesting subjects and we 
> end up with consistent code that is easy to read, review, and maintain.

It's also important to note that there are slightly different 
codingstyles in different parts of the kernel, and you won't get people 
to agree on one.

A common codingstyle is important, but unifying the last bits is simply 
not worth the hassle.

There are more important things than exploiting the corner cases of 
codingstyle, e.g. could you teach checkpatch.pl to give exactly two 
errors for the following code?


while (a);
for (b = 0; b < 50; b++);
for (c = 0; c < sizeof(struct module); c++)
d = e;
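
One reason a checker might flag loops like these, shown as a small sketch. It makes no claim about which two lines Adrian had in mind; the function names are hypothetical, and the only fact demonstrated is standard C: a ';' directly after a loop header is the whole loop body.

```c
/* Returns how often "count++" runs when a stray ';' ends the for line. */
int runs_with_stray_semicolon(int n)
{
	int count = 0;
	int i;

	for (i = 0; i < n; i++);	/* the lone ';' is the entire loop body */
	count++;			/* runs once, after the loop has finished */
	return count;
}

/* Same loop without the stray ';': the increment is the loop body. */
int runs_without_semicolon(int n)
{
	int count = 0;
	int i;

	for (i = 0; i < n; i++)
		count++;
	return count;
}
```

The two functions differ by a single character, yet for n = 50 the first counts one execution and the second counts fifty.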


cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

Re: [patch 1/3] ps3: Disk Storage Driver

2007-07-24 Thread Paul Mackerras

Andy Whitcroft writes:

> Ok, this is something we need to decide on.  Currently we only ask for
> consistent spacing on all the mathematic operators.  This is mostly as
> we do see a large number of non-spaced uses in defines and the like.
> 
> I am happy to expand these tests so they are always spaced on both sides
> style if that is the preference.

It depends very much on the context - on the precedence and relative
importance of one operator with respect to other operators and the
statement as a whole.  In general I prefer spaces around binary
operators, but there are situations where not putting spaces around
some operators can enhance the readability of the statement as a
whole.

If checkpatch.pl starts whinging about operators without spaces that
will just be yet another reason not to use it IMHO.

Also, I prefer the style where the ? and : operators have a space
after them but not before them, rather than a space either side.

Paul.

Re: Patches for REALLY TINY 386 kernels

2007-07-24 Thread Yinghai Lu


On 7/24/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 24, 2007 at 01:50:35PM -0700, Yinghai Lu wrote:
> > On 7/24/07, Helge Hafting <[EMAIL PROTECTED]> wrote:
> >> Andi Kleen wrote:
> >> >> Some people are putting Linux kernels in the "BIOS" (i.e. ROM chip)
> >> >> when using LinuxBIOS (www.linuxbios.org). It _does_ make a lot of
> >> >> difference there how big the kernel is. At the moment you can't do
> >> >> that with anything smaller than a 1 MB chip. But if people could use
> >> >> 512 KB chips because the kernel is small enough that would sure be a
> >> >> great thing.
> >> >>
> >> >
> >> > I'm sure it would be possible to save a lot of text size. But I don't
> >> > think removing the relatively small CPUID code is the right way.
> >> > That is just a big maintenance issue for little gain.
> >> >
> >> Well - anyone compiling linux for BIOS usage is targeting
> >> a single machine.  So an ability to target a single machine is useful,
> >> i.e. run the CPUID at compile-time, put the answer in a constant/macro,
> >> let the optimizer prune the alternatives. :-)
> >
> > we are using AMD64 + LinuxBIOS + Kernel (without acpi) + kexec to load
> > final kernel.
> > So we can use drivers in kernel for any media (SCSI, SATA, IB,...),
> > not like EFI, which needs every driver re-ported. And we could use KVM
> > in kernel to load other OS if needed.
> >
> > The problem is the kernel is getting bigger and bigger, and the old Tiny
> > kernel stopped at 2.6.18...
> >...
>
> Please send:
> - the .config for the last kernel small enough
> - your size limit
> - your gcc version
> and I'll look at this.


http://www.linuxbios.org/Tyan_S2892_Build_Tutorial
http://www.linuxbios.org/pipermail/linuxbios/2006-October/016558.html

YH
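
A minimal sketch of the "answer CPUID at compile time and let the optimizer
prune" idea quoted above. This is a user-space illustration only:
TARGET_HAS_FXSR, runtime_has_fxsr() and pick_save_routine() are hypothetical
names, not kernel symbols, and the build is assumed to pass
-DTARGET_HAS_FXSR=0 or =1 when a single machine is being targeted.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical build-time knob: -DTARGET_HAS_FXSR=0 or =1 pins the
 * answer for one target machine. When it is left undefined, the code
 * falls back to a run-time probe, as a generic build would.
 */
#ifdef TARGET_HAS_FXSR
#define cpu_has_fxsr() (TARGET_HAS_FXSR)
#else
/* Stand-in for a real CPUID probe; always reports "absent" here. */
static int runtime_has_fxsr(void) { return 0; }
#define cpu_has_fxsr() (runtime_has_fxsr())
#endif

/*
 * When cpu_has_fxsr() is a constant, an optimizing compiler can drop
 * the branch that is never taken together with the code it references.
 */
static const char *pick_save_routine(void)
{
	if (cpu_has_fxsr())
		return "fxsave";
	return "fnsave";
}
```

Built without the knob, pick_save_routine() takes the run-time path; built
with -DTARGET_HAS_FXSR=1, the "fnsave" branch becomes dead code that the
compiler is free to discard.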

Re: [linux-pm] Power Management framework proposal

2007-07-24 Thread david


On Wed, 25 Jul 2007, Jerome Glisse wrote:


On 7/24/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

 On Tue, 24 Jul 2007, Jerome Glisse wrote:

>  On 7/23/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> >   On Mon, 23 Jul 2007, Igor Stoppa wrote:
> > >   again, HAL / OHM / Mobilin
> > 
> >   I was trying to define the lower level interfaces that these tools
> >   need. today they can only know what is possible by reading the source
> >   code for each driver and implementing the driver-specific interfaces
> >   necessary to set things, I was proposing a common interface that tools
> >   like this could use instead of requiring all the driver-specific
> >   knowledge.
> > 
> >   in a nutshell (and I know this is probably not detailed enough to be
> >   acceptable)
> > 
> >   1. the software needs to know what the interconnects and dependencies
> >   between devices are (supposedly this is provided via sysfs)
> > 
> >   2. the software needs to know what type of device this is (again,
> >   supposedly this is provided via sysfs)
> > 
> >   3. the software needs to know what modes exist for a driver/piece of
> >   hardware. to make any decisions this information needs to provide
> >   some information about the capability of the mode and the power
> >   consumed in that mode. in addition there will need to be flags to
> >   indicate any special restrictions of a mode
> > 
> >   4. the software needs to know the cost of switching from any mode to
> >   any other mode. since some transitions will interact with other
> >   devices there will need to be flags to indicate such requirements for
> >   specific transitions.
> > 
> >   5. the software needs to be able to find out what mode a device is
> >   in.
> > 
> >   6. the software needs to be able to tell the driver to switch to a
> >   different mode (I think it would be a very good thing if going to
> >   a particular mode was always the same command, no matter what mode
> >   it is currently in)
> > 
> >   7. the software needs to figure out the desire of the user.
> > 
> >   my proposal was addressing items #3-#6. it isn't trying to decide
> >   what to do, simply to allow the software that _is_ trying to decide
> >   what to do a way to find out what it can do.
> > 
> >   David Lang
> 
>  I believe a central place where user can set/change hw state to save
>  power or to increase computational power is definitely a goal to pursue.
>  But i truly think that the OHM approach is the best one ie using plugins
>  so that one can make a plugin specific for each device. The point is
>  that i believe there is no way to do an abstract interface for this and
>  trying to do so will end up doing ugly code and any interface would fail
>  to encompass all possible tweaks that might exist for all devices.

 will each plugin have its own interface? or will you have one interface
 to access the plugins and then the plugins do things behind the scenes?

 I'll bet that the API for the plugins is common, and if so then it could
 be similar to the API that I suggested.


I take ohm as a reference here (this comes from my limited understanding of
this daemon, so there might be inaccuracies): drivers export their power
management tuning capabilities through HAL. Then an ohm plugin would use
HAL to give a higher-level view of these capabilities and also manage policy,
preferences, permissions, ...

The last consumer in the power management food chain would be a user interface
which will communicate with ohm (and with all ohm plugins), so desktop writers
(gnome, kde, ...) can write some kind of power management center where each
ohm plugin can have its own panel. So in the end the user gets one place to do
all their power management, which I think is the goal you are aiming for.


no. I am talking about the interface to the drivers that things like HAL would use



>  For instance on graphics card you could do the following (maybe more):
>  -change GPU clock
>  -change memory clock
>  -disable part of engine
>  -disable unit
>  i truly don't think you can make a common interface for all this, moreover
>  there might be constraints on how you can change things (GPU &
>  memory clock might need to follow a given ratio). So you definitely
>  need knowledge in the user space program to handle this.

 sure you can, just enumerate all the options the driver writer wants to
 offer as options. yes this could be a lengthy list, so what?
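
As a rough illustration of items #3 to #6 in the list quoted above, a driver
could export a table of modes plus a transition-cost matrix. Everything below
(struct pm_mode, struct pm_device, pm_get_mode(), pm_set_mode() and the example
numbers) is hypothetical and not an existing kernel interface; it is a
user-space sketch of the shape of the data, nothing more.

```c
#include <assert.h>
#include <stddef.h>

#define PM_MAX_MODES 4

/* Item 3: one entry per mode, with capability, power and flags. */
struct pm_mode {
	const char *name;
	unsigned int capability_pct;	/* share of full performance */
	unsigned int power_mw;		/* power drawn in this mode */
	unsigned int flags;		/* special restrictions, if any */
};

struct pm_device {
	const struct pm_mode *modes;
	size_t nr_modes;
	/* Item 4: cost in microseconds of going from mode [a] to mode [b]. */
	unsigned int switch_cost_us[PM_MAX_MODES][PM_MAX_MODES];
	size_t current;			/* item 5: the mode the device is in */
};

/* Item 5: report the current mode. */
static size_t pm_get_mode(const struct pm_device *dev)
{
	return dev->current;
}

/*
 * Item 6: the same call selects a mode no matter which mode the device
 * is in now. Returns the transition cost, or -1 for an unknown mode.
 */
static long pm_set_mode(struct pm_device *dev, size_t target)
{
	long cost;

	if (target >= dev->nr_modes)
		return -1;
	cost = dev->switch_cost_us[dev->current][target];
	dev->current = target;
	return cost;
}

/* Made-up data for an imaginary device with three modes. */
static const struct pm_mode example_modes[] = {
	{ "off",  0,   0,    0 },
	{ "slow", 40,  300,  0 },
	{ "full", 100, 1200, 0 },
};

static struct pm_device example_pm_dev = {
	.modes = example_modes,
	.nr_modes = 3,
	.switch_cost_us = {
		{ 0,   500, 800 },
		{ 100, 0,   50  },
		{ 150, 20,  0   },
	},
	.current = 0,
};
```

A policy daemon would read the table once, then call pm_set_mode() with the
index it picked; the constraint flags mentioned in items #3 and #4 are left
as plain integers here because their meaning is still an open question in
this thread.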



My point was that your interface, by trying to fit square pegs into round holes,
will fail to expose all the subtleties of each device, which might in the end
lead to wrong power management decisions. So i believe we can't sum up
power management as a list of modes whose attributes are power consumption
& capacity.


it's possible (which is part of the reason I started the thread), but so 
far there

Re: [patch 2.6.23-rc1] dma_free_coherent() needs irqs enabled (sigh)

2007-07-24 Thread David Brownell

On Tuesday 24 July 2007, Russell King wrote:
> > > 
> > > I think you got the year wrong:
> > > 
> > > 5edf71ae (Russell King      2005-11-25 15:52:51 + 364)      
> > > WARN_ON(irqs_disabled());
> > > 
> > > which is due to this commit:
> > > 
> > > [ARM] Do not call flush_tlb_kernel_range() with IRQs disabled.
> > 
> > This little "to do" list item has been sitting in my mailbox way
> > too long then.  Certainly since it was fair to say "last year"!  ;)
> 
> Are you intentionally not reading what I said?

Hardly.  Go back and read what *I* wrote!

It just took a while to notice that behavioral change, since
I don't normally run the relevant regression tests using lockdep.
It was sometime in the first half of 2006, ergo "since it was
fair to say 'last year'".

A bunch of piecemeal workarounds followed; and recently they were
all replaced with a more fundamental fix.  This doc patch was the
tail end of the process of recovering from that change ... and the
warnings are there to help other folk avoid the same issue
in other contexts.

- Dave


Re: [PATCH 13/16] Switch to operating with pid_numbers instead of pids

2007-07-24 Thread sukadev

Pavel Emelianov [EMAIL PROTECTED] wrote:
| Make alloc_pid() initialize pid_numbers and hash them
| into the hashtable, not the struct pid itself.
| 
| Signed-off-by: Pavel Emelianov <[EMAIL PROTECTED]>
| 
| ---
| 
|  pid.c |   47 +--
|  1 files changed, 33 insertions(+), 14 deletions(-)
| 
| --- ./kernel/pid.c.ve12   2007-07-05 11:06:41.0 +0400
| +++ ./kernel/pid.c2007-07-05 11:08:23.0 +0400
| @@ -28,8 +28,10 @@
|  #include 
|  #include 
|  #include 
| +#include 
| 
| -#define pid_hashfn(nr) hash_long((unsigned long)nr, pidhash_shift)
| +#define pid_hashfn(nr, ns)   \
| + hash_long((unsigned long)nr + (unsigned long)ns, pidhash_shift)
|  static struct hlist_head *pid_hash;
|  static int pidhash_shift;
|  struct pid init_struct_pid = INIT_STRUCT_PID;
| @@ -194,7 +198,7 @@ fastcall void put_pid(struct pid *pid)
|   if (!pid)
|   return;
| 
| - ns = pid->numbers[0].ns;
| + ns = pid->numbers[pid->level].ns;
|   if ((atomic_read(&pid->count) == 1) ||
|        atomic_dec_and_test(&pid->count))
|   kmem_cache_free(ns->pid_cachep, pid);
| @@ -210,13 +214,17 @@ static void delayed_put_pid(struct rcu_h
|  fastcall void free_pid(struct pid *pid)
|  {
|   /* We can be called with write_lock_irq(&tasklist_lock) held */
| + int i;
|   unsigned long flags;
| 
|   spin_lock_irqsave(&pidmap_lock, flags);
| - hlist_del_rcu(&pid->pid_chain);
| + for (i = 0; i <= pid->level; i++)
| + hlist_del_rcu(&pid->numbers[i].pid_chain);
|   spin_unlock_irqrestore(&pidmap_lock, flags);
| 
| - free_pidmap(&init_pid_ns, pid->nr);
| + for (i = 0; i <= pid->level; i++)
| + free_pidmap(pid->numbers[i].ns, pid->numbers[i].nr);
| +
|   call_rcu(&pid->rcu, delayed_put_pid);
|  }
| 
| @@ -224,30 +232,43 @@ struct pid *alloc_pid(struct pid_namespa
|  {
|   struct pid *pid;
|   enum pid_type type;
| - int nr = -1;
| + struct pid_namespace *ns;
| + int i, nr;
| 
| - pid = kmem_cache_alloc(init_pid_ns.pid_cachep, GFP_KERNEL);
| + pid = kmem_cache_alloc(pid_ns->pid_cachep, GFP_KERNEL);
|   if (!pid)
|   goto out;
| 
| - nr = alloc_pidmap(current->nsproxy->pid_ns);
| - if (nr < 0)
| - goto out_free;
| + ns = pid_ns;
| + for (i = pid_ns->level; i >= 0; i--) {
| + nr = alloc_pidmap(ns);
| + if (nr < 0)
| + goto out_free;

If pid_ns->level is say 3 and alloc_pidmap() succeeds when i=0,1
and fails when i=2, we would try to free_pidmap() even from
pid->pid_number[2].pid_ns. This would incorrectly a) drop the
reference count on that pid namespace, and b) increment
pidmap->nr_free.

Should we use kmem_cache_zalloc() and check for a non-NULL pid_ns
before calling free_pidmap() below ?

| 
| + pid->numbers[i].nr = nr;
| + pid->numbers[i].ns = ns;
| + ns = ns->parent;
| + }
| +
| + pid->level = pid_ns->level;
|   atomic_set(&pid->count, 1);
| - pid->nr = nr;
|   for (type = 0; type < PIDTYPE_MAX; ++type)
|   INIT_HLIST_HEAD(&pid->tasks[type]);
| 
|   spin_lock_irq(&pidmap_lock);
| - hlist_add_head_rcu(&pid->pid_chain, &pid_hash[pid_hashfn(pid->nr)]);
| + for (i = pid->level; i >= 0; i--)
| + hlist_add_head_rcu(&pid->numbers[i].pid_chain,
| + &pid_hash[pid_hashfn(pid->numbers[i].nr,
| + pid->numbers[i].ns)]);
|   spin_unlock_irq(&pidmap_lock);
| -
|  out:
|   return pid;
| 
|  out_free:
| - kmem_cache_free(init_pid_ns.pid_cachep, pid);
| + for (i++; i <= pid->level; i++)
| + free_pidmap(pid->numbers[i].ns, pid->numbers[i].nr);

i.e. not all of pid->numbers[] may be initialized here, right?
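
One way to make such an error path safe can be sketched in user space as
below. It follows the two suggestions in the review (zero the object first,
check for a non-NULL namespace before freeing) and additionally bounds the
unwind loop by pid_ns->level, because in the quoted patch pid->level is only
assigned after the allocation loop. All model_* names are hypothetical
stand-ins, not kernel functions, and this is not presented as the fix that
was eventually applied.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEVEL 4

struct model_ns {
	struct model_ns *parent;
	int level;
	int nr_free;	/* stand-in for pidmap->nr_free */
	int fail;	/* when set, model_alloc() fails in this namespace */
};

struct model_pid {
	int level;
	struct { int nr; struct model_ns *ns; } numbers[MAX_LEVEL];
};

static int model_alloc(struct model_ns *ns)
{
	if (ns->fail || ns->nr_free == 0)
		return -1;
	ns->nr_free--;
	return 100 + ns->nr_free;	/* arbitrary id */
}

static void model_free(struct model_ns *ns, int nr)
{
	(void)nr;
	ns->nr_free++;
}

/* Returns 0 on success, -1 after undoing any partial allocation. */
static int model_alloc_pid(struct model_pid *pid, struct model_ns *pid_ns)
{
	struct model_ns *ns = pid_ns;
	int i, nr;

	assert(pid_ns->level >= 0 && pid_ns->level < MAX_LEVEL);

	/* Zero first, so untouched slots are recognisable (ns == NULL). */
	memset(pid, 0, sizeof(*pid));

	for (i = pid_ns->level; i >= 0; i--) {
		nr = model_alloc(ns);
		if (nr < 0)
			goto out_free;
		pid->numbers[i].nr = nr;
		pid->numbers[i].ns = ns;
		ns = ns->parent;
	}
	pid->level = pid_ns->level;
	return 0;

out_free:
	/* Bound by pid_ns->level: pid->level has not been set yet. */
	for (i++; i <= pid_ns->level; i++)
		if (pid->numbers[i].ns)
			model_free(pid->numbers[i].ns, pid->numbers[i].nr);
	return -1;
}
```

With three nested namespaces and a failure in the middle one, only the slot
that was really allocated is handed back, so no free counter ends up higher
than it started.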

Re: crash with 2.6.22.1 crash:ll_rw_blk.c blk_remove_plug()

2007-07-24 Thread Satyam Sharma

On 7/23/07, Jens Axboe <[EMAIL PROTECTED]> wrote:

On Sun, Jul 22 2007, Satyam Sharma wrote:
> Hi Walter,
>
> Thanks for reporting this.
>
> On 7/22/07, walter harms <[EMAIL PROTECTED]> wrote:
>> hello all,
>> on my asus notebook tm620 there is a crash with 2.6.22 and 2.6.21
>
> Did this happen when you were resuming from a suspend-to-ram/disk?
> [ I ask because I see swsusp in the trace below, linux-pm added to Cc: ]
>
>> 
>> Using IPI Shortcut mode
>> WARNING: at block/ll_rw_blk.c:1575 blk_remove_plug()
>>  [] blk_remove_plug+0x36/0x5a
>>  [] __generic_unplug_device+0x14/0x1f
>>  [] __make_request+0x39b/0x49c
>>  [] generic_make_request+0x228/0x255
>>  [] submit_bio+0xa5/0xac
>>  [] mempool_alloc+0x37/0xae
>>  [] submit+0xc2/0x11d
>>  [] bio_read_page+0x24/0x27
>>  [] swsusp_check+0x4f/0xaf
>>  [] software_resume+0x5f/0x108
>>  [] kernel_init+0xb0/0x212
>>  [] ret_from_fork+0x6/0x1c
>>  [] kernel_init+0x0/0x212
>>  [] kernel_init+0x0/0x212
>>  [] kernel_thread_helper+0x7/0x10
>>  ===
>
> Surprising, that's a WARN_ON(!irqs_disabled()) but IRQs are disabled
> alright on that codepath. OTOH, __make_request() is heavily goto-driven,
> uses the non-save/restore variants of spin_lock_irq, and does not even
> balance locks / unlocks for some error paths ... gaah.

__make_request() must be called from process context, hence
spin_lock_irq() is perfectly alright and the fastest way to go. And of
course the locking is balanced! So please save your 'gaah's for code
you actually took the time to try and understand.

You're right, I didn't really look at that code for long (it even explicitly
comments about what's going on with the locking in there!), sorry about
that.

[ Off-topic: BTW does every call to __make_request() end up in
blk_remove_plug()? Since you're explicitly making the assumption
that it *must* be called from process context (and hence the use of
the non-save/restore variants), you could consider putting a
WARN_ON(irqs_disabled()) over there, and perhaps a
WARN_ON(!spin_is_locked(queue_lock)) in blk_remove_plug() instead, and
other such similar functions that currently have the !irqs_disabled
check. This way you'd effectively cover _both_ the assertions,
and in appropriate places -- just a suggestion. ]

But it does look like unbalanced irq disable/enable calls. I'd guess in
the suspend/resume path. Obviously something more esoteric, since this
is the first such report for 2.6.22, so like some not-very-used driver
for instance.

Now that I do look at the codepath, it does seem surprising irqs were
not disabled there. There are a bunch of calls to _other_ functions
between the spin_lock_irq and the blk_remove_plug via
__generic_unplug_device that would also have complained about
!irqs_disabled.

Walter, does this happen reproducibly?

Satyam

Re: 2.6.20->2.6.21 - networking dies after random time

2007-07-24 Thread Thomas Gleixner

On Tue, 2007-07-24 at 22:04 +0200, Ingo Molnar wrote:
> Marcin, could you try the patch below too? [without having any other 
> patch applied.] It basically turns the critical section into an irqs-off 
> critical section and thus checks whether your problem is related to that 
> particular area of code.
> 

I read back on this thread and I think the problem is somewhere else:

delayed disable relies on the ability to re-trigger the interrupt in the
case that a real interrupt happens after the software disable was set.
In this case we actually disable the interrupt on the hardware level
_after_ it occurred.

On enable_irq, we need to re-trigger the interrupt. On i386 this relies
on a hardware resend mechanism (send_IPI_self()). 

Actually we only need the resend for edge type interrupts. Level type
interrupts come back once enable_irq() re-enables the interrupt line.

I assume that the interrupt in question is level triggered because it is
shared and above the legacy irqs 0-15:

17: 12   IO-APIC-fasteoi   eth1, eth0

Looking into the IO_APIC code, the resend via send_IPI_self() happens
unconditionally. So the resend is done for level and edge interrupts.
This makes the problem more mysterious.

The code in question lib8390.c does

disable_irq();
fiddle_with_the_network_card_hardware()
enable_irq();

The fiddle_with_the_network_card_hardware() might cause interrupts,
which are cleared in the same code path again.

Marcin found that when he disables the irq line on the hardware level
(removing the delayed disable) the card is kept alive.

So the difference is that we can get a resend on enable_irq, when an
interrupt happens while we are in the disabled region.

No idea how this affects the network card, as the code there must be
able to handle interrupts, which are not originated from the card due to
interrupt sharing.

Marcin, can you please try the patch below ? It's just a debugging aid
to gather some more data about that problem.

If the patch fixes the problem, then we should try to disable the resend
mechanism for non-edge-type irq lines on the irq_chip level (i.e. the
IOAPIC code)
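
The delayed-disable and resend behaviour described above can be modelled in
user space roughly as follows. struct model_irq and the model_* functions are
hypothetical and only mirror the description in this mail; the resend_level
argument stands for the question the debugging patch asks, namely what
changes when the software resend is skipped for lines that are not edge
triggered. A real level-triggered line would re-fire by itself while the
device keeps it asserted, which this model does not simulate.

```c
#include <assert.h>
#include <stdbool.h>

struct model_irq {
	bool soft_disabled;	/* set by disable, hardware left untouched */
	bool hw_masked;		/* line really masked at the controller */
	bool pending;		/* an interrupt arrived while disabled */
	bool edge;		/* edge-triggered line? */
	unsigned int handled;	/* times the handler ran */
	unsigned int resent;	/* software resends issued */
};

/* Delayed disable: only mark the line, do not touch the hardware yet. */
static void model_disable_irq(struct model_irq *irq)
{
	irq->soft_disabled = true;
}

/* Called when the hardware raises the interrupt. */
static void model_hw_interrupt(struct model_irq *irq)
{
	if (irq->hw_masked)
		return;			/* masked: never reaches the CPU */
	if (irq->soft_disabled) {
		irq->hw_masked = true;	/* mask _after_ it occurred */
		irq->pending = true;	/* remember it for enable time */
		return;
	}
	irq->handled++;
}

/* Re-enable the line and replay a remembered interrupt if asked to. */
static void model_enable_irq(struct model_irq *irq, bool resend_level)
{
	irq->soft_disabled = false;
	irq->hw_masked = false;
	if (!irq->pending)
		return;
	irq->pending = false;
	if (irq->edge || resend_level) {
		irq->resent++;
		irq->handled++;
	}
}
```

Running disable, one hardware interrupt, then enable on a level-triggered
line gives one resend with resend_level set and none without it, which is
the difference the printk in the patch below is meant to expose.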

Thanks,

tglx

--- linux-2.6.orig/kernel/irq/resend.c
+++ linux-2.6/kernel/irq/resend.c
@@ -62,6 +62,15 @@ void check_irq_resend(struct irq_desc *desc, unsigned int irq)
 */
desc->chip->enable(irq);

+   /*
+* Temporary hack to figure out more about the problem, which
+* is causing the ancient network cards to die.
+*/
+   if (desc->handle_irq != handle_edge_irq) {
+   printk(KERN_DEBUG "Skip resend for irq %u\n", irq);
+   return;
+   }
+
if ((status & (IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) {
desc->status = (status & ~IRQ_PENDING) | IRQ_REPLAY;


Re: sysfs/udev broken in 2.6.23-rc1 [input, i2c, ...] (Was: sysfs/udev broken in latest git?)

2007-07-24 Thread Kay Sievers

On 7/24/07, Simon Arlott <[EMAIL PROTECTED]> wrote:

On 24/07/07 17:34, Kay Sievers wrote:
> On 7/24/07, Simon Arlott <[EMAIL PROTECTED]> wrote:
>> On 24/07/07 13:54, Cornelia Huck wrote:
>> > On Tue, 24 Jul 2007 11:20:02 +0200,
>> > "Kay Sievers" <[EMAIL PROTECTED]> wrote:
>> >
>> >> It looks fine to me. "device" links must never point to anything else
>> >> than a bus device.

While it's still true, for input we have special rules because the
"stacked class devices" existed only there. At least for
SYSFS_DEPRECATED, all input devices should have a "device" symlink
pointing to the bus-device.

>> > Hm, but then
>> > 1. The patch sneaks this check in (the old code only checked for
>> >dev->parent)
>> > 2. The code is rather inconsistent now, since none of the other code
>> >paths check for dev->parent->bus...

Yeah, that's true.

>> Removing the dev->parent->bus check fixes it:

Yes, let's remove the check, I will check now if we possibly need to
fix more than this or only the block-device patch.

Thanks,
Kay

Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK

2007-07-24 Thread Andrew Morton

On Tue, 24 Jul 2007 16:58:51 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Tue, 24 Jul 2007, Andrew Morton wrote:
> 
> > __GFP_COMP I'm not so sure about.
> > drivers/char/drm/drm_pci.c:drm_pci_alloc() (and other places like
> > infiniband) pass it into dma_alloc_coherent() which some architectures
> > implement via slab.  umm, arch/arm/mm/consistent.c is one such.
> 
> Is drm_pci_alloc really right in setting __GFP_COMP?

I don't see what's special about that dma_alloc_coherent() call.

> dma_alloc_coherent does not set __GFP_COMP for other higher order allocs 
> and expects to be able to operate on the page structs independently. That 
> is not the case for a compound page.
> 
> Creates a really interesting case for SLAB. Slab did not use __GFP_COMP in 
> order to be able to allow the use of page->private (No longer an issue since 
> the 2.6.22 cleanups and avoiding the use of page->private for the compound 
> head).
> 
> Now the __GFP_COMP flag is passed through for any higher order page alloc 
> (such as a kmalloc allocation > PAGE_SIZE). Then we may have allocated one 
> slab that is a compound page among other higher order pages allocated 
> without __GFP_COMP. May have caused rare and strange failures in 2.6.21 
> and earlier because of the concurrent page->private use in compound head 
> pages and arch pages.
> 
> SLUB will always use __GFP_COMP so the pages are consistent regardless if 
> __GFP_COMP is passed in or not.
> 
> The strange scenarios come about by expecting a page allocation when 
> sometimes we just substitute a slab alloc.
> 
> We could filter __GFP_COMP out to avoid the BUG()? Or deal with it on a 
> case by case basis?

Fix callers, I'd suggest.  There are a number of fishy-looking open-coded
usages of __GFP_COMP around the place.

It's a bit sad that some architectures are using slab for dma_alloc_coherent()
while others go to alloc_pages().

Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK

2007-07-24 Thread Christoph Lameter

On Tue, 24 Jul 2007, Andrew Morton wrote:

> __GFP_COMP I'm not so sure about. 
> drivers/char/drm/drm_pci.c:drm_pci_alloc() (and other places like infiniband)
> pass it into dma_alloc_coherent() which some architectures implement via
> slab.  umm, arch/arm/mm/consistent.c is one such.

Is drm_pci_alloc really right in setting __GFP_COMP?
dma_alloc_coherent does not set __GFP_COMP for other higher order allocs 
and expects to be able to operate on the page structs independently. That 
is not the case for a compound page.

Creates a really interesting case for SLAB. Slab did not use __GFP_COMP in 
order to be able to allow the use of page->private (No longer an issue since 
the 2.6.22 cleanups and avoiding the use of page->private for the compound 
head).

Now the __GFP_COMP flag is passed through for any higher order page alloc 
(such as a kmalloc allocation > PAGE_SIZE). Then we may have allocated one 
slab that is a compound page among other higher order pages allocated 
without __GFP_COMP. May have caused rare and strange failures in 2.6.21 
and earlier because of the concurrent page->private use in compound head 
pages and arch pages.

SLUB will always use __GFP_COMP so the pages are consistent regardless if 
__GFP_COMP is passed in or not.

The strange scenarios come about by expecting a page allocation when 
sometimes we just substitute a slab alloc.

We could filter __GFP_COMP out to avoid the BUG()? Or deal with it on a 
case by case basis?
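
The two options in that last question can be sketched as plain bit
operations. The MODEL_* values below are hypothetical placeholders, not the
kernel's real __GFP_* bits, and the helper names are made up for this
illustration.

```c
#include <assert.h>

/* Hypothetical bit values; the kernel's real __GFP_* values differ. */
#define MODEL_GFP_ZERO	0x1u
#define MODEL_GFP_COMP	0x2u

/* Flags the slab-backed path is prepared to see from its callers. */
#define MODEL_SLAB_ALLOWED	(MODEL_GFP_ZERO)

/* Option 1: silently drop the compound-page request. */
static unsigned int model_filter_comp(unsigned int flags)
{
	return flags & ~MODEL_GFP_COMP;
}

/* Option 2: report the offending bits so each caller can be fixed. */
static unsigned int model_unexpected_bits(unsigned int flags)
{
	return flags & ~MODEL_SLAB_ALLOWED;
}
```

Filtering hides the mismatch from the caller, while reporting the unexpected
bits keeps it visible so that it can be handled case by case.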


Re: Regression in serial console on ia64 after 2.6.22

2007-07-24 Thread Yinghai Lu


IA64

Subject : Regression in serial console on ia64 after 2.6.22
References  : http://marc.info/?l=linux-ia64&m=118483645914066&w=2
Last known good : ?
Submitter   : Horms <[EMAIL PROTECTED]>
Caused-By   : Yinghai Lu <[EMAIL PROTECTED]>
  commit 18a8bd949d6adb311ea816125ff65050df1f3f6e
Handled-By  : ?
Status  : unknown


please test this patch.

YH


[PATCH] ia64: move machvec_init before parse_early_param

So ia64_mv is initialized before early console

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/ia64/kernel/machvec.c b/arch/ia64/kernel/machvec.c
index 13df337..a94feaa 100644
--- a/arch/ia64/kernel/machvec.c
+++ b/arch/ia64/kernel/machvec.c
@@ -14,12 +14,6 @@ struct ia64_machine_vector ia64_mv;
 EXPORT_SYMBOL(ia64_mv);
 
 static __initdata const char *mvec_name;
-static __init int setup_mvec(char *s)
-{
-   mvec_name = s;
-   return 0;
-}
-early_param("machvec", setup_mvec);
 
 static struct ia64_machine_vector * __init
 lookup_machvec (const char *name)
@@ -42,6 +36,10 @@ machvec_init (const char *name)
 
if (!name)
name = mvec_name ? mvec_name : acpi_get_sysname();
+
+   if (!mvec_name)
+   mvec_name = name;
+
mv = lookup_machvec(name);
if (!mv)
panic("generic kernel failed to find machine vector for"
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index cf06fe7..b06d7b7 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -481,6 +481,9 @@ int __init reserve_elfcorehdr(unsigned long *start, unsigned long *end)
 void __init
 setup_arch (char **cmdline_p)
 {
+#ifdef CONFIG_IA64_GENERIC
+   char *mvstr;
+#endif
unw_init();
 
ia64_patch_vtop((u64) __start___vtop_patchlist, (u64) __end___vtop_patchlist);
@@ -491,12 +494,15 @@ setup_arch (char **cmdline_p)
efi_init();
io_port_init();
 
-   parse_early_param();
-
 #ifdef CONFIG_IA64_GENERIC
-   machvec_init(NULL);
+   mvstr = strstr(*cmdline_p, "machvec=");
+   if (mvstr)
+   mvstr = strchr(mvstr, '=') + 1;
+   machvec_init(mvstr);
 #endif
 
+   parse_early_param();
+
if (early_console_setup(*cmdline_p) == 0)
mark_bsp_online();
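
The patch above locates "machvec=" with a bare strstr() before
parse_early_param() runs. A slightly stricter scan, which only accepts the
option at the start of the line or after a space and stops the value at the
next space, might look like the user-space sketch below. model_find_machvec()
is a hypothetical helper written for this illustration and is not part of
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Find "machvec=<name>" on a command line and copy <name> into buf.
 * The match must start the line or follow a space, and the value stops
 * at the next space. Returns buf, or NULL if absent or too long.
 */
static char *model_find_machvec(const char *cmdline, char *buf, size_t len)
{
	static const char key[] = "machvec=";
	const char *p = cmdline;
	size_t n;

	while ((p = strstr(p, key)) != NULL) {
		if (p == cmdline || p[-1] == ' ')
			break;			/* a real option start */
		p += sizeof(key) - 1;		/* substring hit, keep looking */
	}
	if (!p)
		return NULL;

	p += sizeof(key) - 1;			/* skip past "machvec=" */
	n = strcspn(p, " ");			/* value ends at next space */
	if (n + 1 > len)
		return NULL;
	memcpy(buf, p, n);
	buf[n] = '\0';
	return buf;
}
```

Copying into a caller-supplied buffer avoids handing the rest of the command
line to the lookup, which is what a pointer just past the '=' would do.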
 



Re: Thinkpad ACPI

2007-07-24 Thread Steven

On Tue, 24 Jul 2007 17:19:17 -0500, YOSHIFUJI Hideaki / 吉藤英明 wrote:

> Linux 2.6.23-rc1 fails to power off my ThinkPad T42. Git-bisect told me
> that the following commit is to blame, and by reverting that commit, it
> works appropriately.

I have noted the same behavior on a Thinkpad 600X.

On the topic of Thinkpad ACPI, the following has occurred twice (out of over
one hundred boots) in the last month during boot, once with a 2.6.21.5
kernel and once with 2.6.22.1.

Jul 23 06:51:31 celestial kernel: [ 204.882991] ACPI: Core revision 20070126
Jul 23 06:51:31 celestial kernel: [ 204.947102] ACPI: setting ELCR to 0a00 (from 0800)
Jul 23 06:51:31 celestial kernel: [ 208.791654] ACPI Error (hwacpi-0142): Hardware did not change modes [20070126]
Jul 23 06:51:31 celestial kernel: [ 208.792229] ACPI Error (evxfevnt-0086): Could not transition to ACPI mode [20070126]
Jul 23 06:51:31 celestial kernel: [ 208.792806] ACPI Warning (utxface-0139): AcpiEnable failed [20070126]
Jul 23 06:51:31 celestial kernel: [ 208.793299] ACPI: Unable to enable ACPI


Re: Power Management framework proposal

2007-07-24 Thread Benjamin Herrenschmidt

On Tue, 2007-07-24 at 16:02 -0700, [EMAIL PROTECTED] wrote:
> 
> what requirements are needed? (I'm sure that there are others, but 
> hopefully it's possible to avoid requirements like 'the clock speed
> for 
> device A must be >X to allow device B to operate in mode Y') 

I had an idea a while ago, might still be in the pm list archives, of
exposing constraints as opaque bitmaps. The bits have defined meaning
for a given bus, but are opaque to the core.

The devices however, provide tables indicating to the core their list of
power states (with names) and their requirements in terms of parent
states (using such bitmasks).

Thus, the core can resolve the dependency requirements without having to
know about the actual meaning of the states of the various busses.

Ben.
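
A minimal user-space sketch of the opaque-bitmap idea above, under the
assumption that each device state carries a mask of the parent-bus states it
can tolerate. The struct and function names (model_pstate, model_dev,
model_state_allowed(), model_deepest_allowed()) and the example table are
hypothetical; the point is only that the core tests bits without knowing
what a given bus state means.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* One power state of a device: a name for humans, a mask for the core. */
struct model_pstate {
	const char *name;
	unsigned long parent_ok;	/* bitmask of acceptable parent states */
};

struct model_dev {
	const struct model_pstate *states;
	size_t nr_states;
};

/*
 * Core-side check: may the device enter state `want` while its parent
 * bus is in state number `parent_state`? The core only tests a bit and
 * never interprets what the parent state means.
 */
static int model_state_allowed(const struct model_dev *dev, size_t want,
			       unsigned int parent_state)
{
	if (want >= dev->nr_states)
		return 0;
	if (parent_state >= sizeof(unsigned long) * CHAR_BIT)
		return 0;
	return (int)((dev->states[want].parent_ok >> parent_state) & 1UL);
}

/* Highest-numbered state the parent state permits, or -1 if none. */
static int model_deepest_allowed(const struct model_dev *dev,
				 unsigned int parent_state)
{
	size_t i;

	for (i = dev->nr_states; i-- > 0; )
		if (model_state_allowed(dev, i, parent_state))
			return (int)i;
	return -1;
}

/* Imaginary device on a bus whose states are numbered 0, 1 and 2. */
static const struct model_pstate example_states[] = {
	{ "active", 1UL << 0 },
	{ "idle",   (1UL << 0) | (1UL << 1) },
	{ "off",    (1UL << 0) | (1UL << 1) | (1UL << 2) },
};

static const struct model_dev example_bus_dev = { example_states, 3 };
```

The bus driver alone decides what bits 0, 1 and 2 stand for; the resolver
works unchanged for any bus that numbers its states this way.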


Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-24 Thread Chris Friesen


Chris Snook wrote:

A fraction of *each* CPU, or a fraction of *total* CPU?  Per-cpu 
granularity doesn't make anything more fair.


Well, our current solution uses per-cpu weights, because our vendor 
couldn't get the load balancer working accurately enough.  Having 
per-cpu weights and cpu affinity gives acceptable results for the case 
where we're currently using it.


If the load balancer is good enough, per-system weights would be fine. 
It would have to play nicely with affinity though, in the case where it 
makes sense to lock tasks to particular cpus.


If I have two threads with the same priority, and two CPUs, the 
scheduler will put one on each CPU, and they'll run happily without any 
migration or balancing.


Sure.  Now add a third thread.  How often do you migrate?  Put another 
way, over what time quantum do we ensure fairness?


Chris
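
The "third thread" question can be made concrete with a toy model: three
equal-priority threads on two CPUs, where one thread has a CPU to itself and
the other two alternate on the second CPU, and the solo thread changes every
`period` ticks. This rotation policy and the model_run() helper are
assumptions made for the arithmetic only; they are not how CFS or any real
load balancer behaves.

```c
#include <assert.h>

#define NR_THREADS 3

/*
 * Three equal-priority threads on two CPUs. In each period one thread
 * has CPU 0 to itself and the other two alternate ticks on CPU 1; the
 * solo thread changes every `period` ticks. Fills ran[] with the ticks
 * each thread received during the first `window` ticks.
 */
static void model_run(unsigned int period, unsigned int window,
		      unsigned int ran[NR_THREADS])
{
	unsigned int t, solo, a, b;

	assert(period > 0);

	for (t = 0; t < NR_THREADS; t++)
		ran[t] = 0;

	for (t = 0; t < window; t++) {
		solo = (t / period) % NR_THREADS;
		a = (solo + 1) % NR_THREADS;
		b = (solo + 2) % NR_THREADS;
		ran[solo]++;			/* CPU 0 */
		ran[(t % 2) ? b : a]++;		/* CPU 1, round robin */
	}
}
```

Over a 600-tick window with no migration (period 1000) the split is 600, 300
and 300 ticks. With a migration every 200 ticks each thread gets 400 ticks,
the ideal two thirds of a CPU, but only when measured over a window that is
a multiple of three periods; over anything shorter the split is still
uneven.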

Re: 2.6.23-rc1 sky2 boot crash in sky2_mac_intr

2007-07-24 Thread Michal Piotrowski


Hi Florian,

On 24/07/07, Florian Lohoff <[EMAIL PROTECTED]> wrote:

On Tue, Jul 24, 2007 at 09:50:08AM +0100, Stephen Hemminger wrote:
> The problem is related to power management. The PHY has a number of PCI
> configuration registers for power control, and the function of these changes
> based on the version and revision of the chip. The driver does work on older
> versions of the EC-U, in Fujitsu laptops, it is just the new rev that is
> broken.
>
> The driver should probably fail smarter (by not loading) if the PHY isn't
> powered up correctly, but that doesn't help your problem.
>
> The vendor has provided me with documentation on many versions
> of the chip, but I don't have docs on the latest revision differences of
> the EC Ultra, so a proper solution is not easily available.  The best method
> for resolving this would be to first try the vendor driver version of sk98lin
> and see if that fixes it. If so, then it is easy to change sky2 to match the
> PHY setup in the vendor driver. Another possibility is to look for places in
> the sky2 driver that compare version/revision.
>
> The most likely bits that need to change are in PCI registers 0x80, 0x84 and
> 0x88. You could also load the Windows driver and dump PCI config space (with
> lspci from cygwin), and see what the settings are there.
>
> I am away from my office for a month, and therefore away from any sky2
> hardware for testing.

I'll try the above and keep you posted. The crash itself seems to be a
2.6.23-rc1 regression though. I never experienced this with 2.6.22-rc5
which i was running before.


Can you try to figure out what is causing this crash and then use git-bisect?

Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/

Re: [PATCH RFC] extent mapped page cache

2007-07-24 Thread Chris Mason

On Tue, 24 Jul 2007 23:25:43 +0200
Peter Zijlstra <[EMAIL PROTECTED]> wrote:

> On Tue, 2007-07-24 at 16:13 -0400, Trond Myklebust wrote:
> > On Tue, 2007-07-24 at 16:00 -0400, Chris Mason wrote:
> > > On Tue, 10 Jul 2007 17:03:26 -0400
> > > Chris Mason <[EMAIL PROTECTED]> wrote:
> > > 
> > > > This patch aims to demonstrate one way to replace buffer heads
> > > > with a few extent trees.  Buffer heads provide a few different
> > > > features:
> > > > 
> > > > 1) Mapping of logical file offset to blocks on disk
> > > > 2) Recording state (dirty, locked etc)
> > > > 3) Providing a mechanism to access sub-page sized blocks.
> > > > 
> > > > This patch covers #1 and #2, I'll start on #3 a little later
> > > > next week.
> > > > 
> > > Well, almost.  I decided to try out an rbtree instead of the
> > > radix, which turned out to be much faster.  Even though
> > > individual operations are slower, the rbtree was able to do many
> > > fewer ops to accomplish the same thing, especially for merging
> > > extents together.  It also uses much less ram.
> > 
> > The problem with an rbtree is that you can't use it together with
> > RCU to do lockless lookups. You can probably modify it to allocate
> > nodes dynamically (like the radix tree does) and thus make it
> > RCU-compatible, but then you risk losing the two main benefits that
> > you list above.

The tree is a critical part of the patch, but it is also the easiest to
rip out and replace.  Basically the code stores a range by inserting
an object at an index corresponding to the end of the range.

Then it does searches by looking forward from the start of the range.
More or less any tree that can search and return the first key >=
than the requested key will work.

So, I'd be happy to rip out the tree and replace with something else.
Going completely lockless will be tricky; it's something that will need deep
thought once the rest of the interface is sane.

-chris

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-24 Thread hui

On Tue, Jul 24, 2007 at 05:22:47PM -0400, Chris Snook wrote:
> Bill Huey (hui) wrote:
> Well, you need enough CPU time to meet your deadlines.  You need 
> pre-allocated memory, or to be able to guarantee that you can allocate 
> memory fast enough to meet your deadlines.  This principle extends to any 
> other shared resource, such as disk or network.  I'm being vague because 
> it's open-ended.  If a medical device fails to meet realtime guarantees 
> because the battery fails, the patient's family isn't going to care how 
> correct the software is.  Realtime engineering is hard.
...
> Actually, it's worse than merely an open problem.  A clairvoyant fair 
> scheduler with perfect future knowledge can underperform a heuristic fair 
> scheduler, because the heuristic scheduler can guess the future incorrectly 
> resulting in unfair but higher-throughput behavior.  This is a perfect 
> example of why we only try to be as fair as is beneficial.

I'm glad we agree on the above points. :)

It might be that there needs to be another, stiffer policy than what goes
into SCHED_OTHER, in that we also need a SCHED_ISO or something that has
stricter rebalancing semantics for -rt applications, sort of a super SCHED_RR.
That's definitely needed and I don't see how the current CFS implementation
can deal with this properly, even with numerical running averages, etc.,
at this time.

SCHED_FIFO is another issue, but this is actually more complicated than just
per-cpu run queues, in that it needs a global priority analysis. I don't see
how CFS can deal with SCHED_FIFO efficiently without moving to a single run
queue. This is kind of a complicated problem with a significant set of
trade-offs to take into account (cpu binding, etc.)

>> Tong's previous trio patch is an attempt at resolving this using a generic
>> grouping mechanism and some constructive discussion should come of it.
>
> Sure, but it seems to me to be largely orthogonal to this patch.

It's based on the same kinds of ideas that he's been experimenting with in
Trio. I can't name a single other engineer that's posted to lkml recently
that has quite the depth of experience in this area that he has. It would be
nice to facilitate/incorporate some of his ideas or get him to work on
something to this end that's suitable for inclusion in some tree somewhere.

bill


Re: [linux-pm] Power Management framework proposal

2007-07-24 Thread Jerome Glisse

On 7/24/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

On Tue, 24 Jul 2007, Jerome Glisse wrote:

> On 7/23/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>>  On Mon, 23 Jul 2007, Igor Stoppa wrote:
>> >  again, HAL / OHM / Mobilin
>>
>>  I was trying to define the lower level interfaces that these tools need.
>>  today they can only know what is possible by reading the source code for
>>  each driver and implementing the driver-specific interfaces necessary to
>>  set things, I was proposing a common interface that tools like this could
>>  use instead of requiring all the driver-specific knowledge.
>>
>>
>>  in a nutshell (and I know this is probably not detailed enough to be acceptable)
>>
>>  1. the software needs to know what the interconnects and dependencies
>>  between devices are (supposedly this is provided via sysfs)
>>
>>  2. the software needs to know what type of device this is (again,
>>  supposedly this is provided via sysfs)
>>
>>  3. the software needs to know what modes exist for a driver/piece of
>>  hardware. to make any decisions this information needs to provide some
>>  information about the capability of the mode and the power consumed in
>>  that mode. in addition there will need to be flags to indicate any
>>  special restrictions of a mode
>>
>>  4. the software needs to know the cost of switching from any mode to any
>>  other mode. since some transitions will interact with other devices
>>  there will need to be flags to indicate such requirements for specific
>>  transitions.
>>
>>  5. the software needs to be able to find out what mode a device is in.
>>
>>  6. the software needs to be able to tell the driver to switch to a
>>  different mode (I think it would be a very good thing if going to a
>>  particular mode was always the same command, no matter what mode it is
>>  currently in)
>>
>>  7. the software needs to figure out the desire of the user.
>>
>>  my proposal was addressing items #3-#6. it isn't trying to decide what to
>>  do, simply to allow the software that _is_ trying to decide what to do a
>>  way to find out what it can do.
>>
>>  David Lang
>
> I believe a central place where user can set/change hw state to save
> power or to increase computational power is definitely a goal to pursue.
> But i truly think that the OHM approach is the best one, i.e. using plugins
> so that one can make a plugin specific for each device. The point is that
> i believe there is no way to do an abstract interface for this and trying to
> do so will end up producing ugly code, and any interface would fail to encompass
> all possible tweaks that might exist for all devices.

will each plugin have its own interface? or will you have one interface
to access the plugins and then the plugins do things behind the scenes?

I'll bet that the API for the plugins is common, and if so then it could
be similar to the API that I suggested.
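
As a rough illustration only, items #3 to #6 of the proposal quoted above could map onto a C interface along these lines. Every name here (struct pm_mode, struct pm_device, pm_set_mode() and so on) is hypothetical; nothing like this exists in the kernel or in OHM, and the units are assumptions:

```c
#include <assert.h>
#include <stddef.h>

#define PM_MAX_MODES 4

/* Hypothetical description of one mode (item 3). */
struct pm_mode {
    const char *name;
    unsigned int capability_pct; /* relative capability, 0..100 */
    unsigned int power_mw;       /* power drawn while in this mode */
    unsigned int flags;          /* special restrictions, if any */
};

/* Hypothetical per-device table covering items 3 to 6. */
struct pm_device {
    const struct pm_mode *modes;
    size_t nr_modes;
    /* cost_us[from][to]: time needed to switch modes (item 4) */
    unsigned int cost_us[PM_MAX_MODES][PM_MAX_MODES];
    size_t current;              /* index into modes (item 5) */
};

/* Item 5: report the mode the device is in. */
size_t pm_get_mode(const struct pm_device *dev)
{
    return dev->current;
}

/* Item 4: cost of going from the current mode to mode `to`. */
unsigned int pm_switch_cost_us(const struct pm_device *dev, size_t to)
{
    assert(to < dev->nr_modes && dev->nr_modes <= PM_MAX_MODES);
    return dev->cost_us[dev->current][to];
}

/* Item 6: the same call whatever the current mode. Returns 0 or -1. */
int pm_set_mode(struct pm_device *dev, size_t mode)
{
    if (mode >= dev->nr_modes)
        return -1;
    dev->current = mode;
    return 0;
}

/*
 * Example policy helper living in the deciding software, not the driver:
 * lowest-power mode with at least min_cap capability, or nr_modes if none.
 */
size_t pm_cheapest_mode(const struct pm_device *dev, unsigned int min_cap)
{
    size_t best = dev->nr_modes;

    for (size_t i = 0; i < dev->nr_modes; i++) {
        if (dev->modes[i].capability_pct < min_cap)
            continue;
        if (best == dev->nr_modes ||
            dev->modes[i].power_mw < dev->modes[best].power_mw)
            best = i;
    }
    return best;
}
```

The driver only enumerates and switches modes; choosing among them (item #7) stays outside, which is the split the proposal argues for.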

I take ohm as a reference here (this comes from my limited understanding of
this daemon, so there might be inaccuracies): drivers export their power
management tuning capabilities through HAL. Then an ohm plugin would use
HAL to give a higher-level view of these capabilities and also manage policy,
preferences, permissions, ...

The last consumer in the power management food chain would be a user interface
which will communicate with ohm (and with all ohm plugins), so desktop writers
(gnome, kde, ...) can write some kind of power management center where each ohm
plugin can have its own panel. So in the end the user gets one place to do all
their power management, which is the goal I think you are aiming for.

> For instance on graphics card you could do the following (maybe more):
> -change GPU clock
> -change memory clock
> -disable part of engine
> -disable unit
> i truly don't think you can make a common interface for all this, more
> over there might be constraint on how you can change things (GPU &
> memory clock might need to follow a given ratio). So you definitely
> need knowledge in the user space program to handle this.

sure you can, just enumerate all the options the driver writer wants to
offer as options. yes this could be a lengthy list, so what?

My point was that your interface, by trying to fit square pegs into round holes,
will fail to expose all the subtleties of each device, which might in the end
lead to wrong power management decisions. So I believe we can't sum up
power management as a list of modes whose attributes are power consumption
and capacity.

And there is no way to design an abstraction, given that all the hw we will have
to deal with is too different and does not follow any standard
(besides ACPI there are other ways to save power: brightness, gpu/memory
clock, pll, ...), so I don't see how one might give a common view of things
which are fundamentally different in how they affect consumption (same end
result with many different paths leading to it).

best,
Jerome Glisse

Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK

2007-07-24 Thread Andrew Morton

On Tue, 24 Jul 2007 16:00:32 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Tue, 24 Jul 2007, Andrew Morton wrote:
> 
> > I think I'll duck this for now.  Otherwise I have a suspicion that I'll
> > be the first person to run it and I'm too old for such excitement.
> 
> I always had the suspicion that you have some magical script 
> which will immediately tell you that a patch is not working ;-)

sort of a defensive crouch.

> Works fine on x86_64 (on top of the ctor cleanup patchset) and passes the 
> kernel build test but then there may be creatively designed drivers and 
> such that pass these flags to the slab allocators which will now BUG.

__GFP_COLD looks OK.

__GFP_COMP I'm not so sure about.
drivers/char/drm/drm_pci.c:drm_pci_alloc() (and other places like infiniband)
pass it into dma_alloc_coherent() which some architectures implement via slab.
umm, arch/arm/mm/consistent.c is one such.

__GFP_MOVABLE looks OK.


Re: Is PIE randomization breaking klibc binaries?

2007-07-24 Thread Ulrich Kunitz

On 07-07-24 15:45 H. Peter Anvin wrote:

> Chuck Ebbert wrote:
> >
> >Okay, I tested with Fedora on x86_64 and it worked there too.
> >(Not that that proves much.)
> >
> >Did you capture any of the error messages, like the address
> >of the segfault?
> >
> 
> FWIW, on x86-64, this should show up in dmesg.
> 
>   -hpa

Guys, this is at boot time and most of the binaries don't work.
However at the end busybox is called and then there is a shell,
where I can call the binaries and force the segmentation
violation. Pencil and paper work usually. But right now I don't
have the broken kernel anymore and it's 1 am here. Wait for
tomorrow.

-- 
Uli Kunitz

Re: [DRIVER SUBMISSION] DRBD wants to go mainline

2007-07-24 Thread Satyam Sharma

Hi Lars,

On 7/24/07, Lars Ellenberg <[EMAIL PROTECTED]> wrote:

On Mon, Jul 23, 2007 at 07:10:58PM +0530, Satyam Sharma wrote:
> On 7/23/07, Lars Ellenberg <[EMAIL PROTECTED]> wrote:
> >On Sun, Jul 22, 2007 at 09:32:02PM -0400, Kyle Moffett wrote:
> >[...]
> >> Don't use signals between kernel threads, use proper primitives like
> >> notifiers and waitqueues, which means you should also probably switch
> >> away from kernel_thread() to the kthread_*() APIs.  Also you should fix
> >> this FIXME or remove it if it no longer applies:-D.
> >
> >right.
> >but how to I tell a network thread in tcp_recvmsg to stop early,
> >without using signals?
>
> Yup, kthreads API cannot handle (properly stop) kernel threads that want
> to sleep on possibly-blocking-forever-till-signalled-functions such as
> tcp_recvmsg or skb_recv_datagram etc etc.
>
> There are two workarounds:
> 1. Use sk_rcvtimeo and related while-continue logic
> 2. force_sig(SIGKILL) to your kernel thread just before kthread_stop
>   (note that you don't need to allow / write code to handle / etc signals
>   in your kthread code -- force_sig will work automatically)

this is not only at stop time.
for example our "drbd_asender" thread
does receive as well as send,

That's normal -- in fact it would've been surprising if your kthread only
did recvs but no sends!

But where does the "send" come into the picture over here -- a send
won't block forever, so I don't foresee any issues whatsoever w.r.t.
kthreads conversion for that. [ BTW I hope you're *not* using any
signals-based interface for your kernel thread _at all_. Kthreads
disallow (ignore) all signals by default, as they should, and you really
shouldn't need to write any logic to handle or do-certain-things-on-seeing
a signal in a well designed kernel thread. ]

and the sending
latency is crucial to performance, while the recv
will not timeout for the next few seconds.

Again, I don't see what sending latency has to do with a kernel_thread
to kthread conversion. Or with signals, for that matter. Anyway, as
Kyle Moffett mentioned elsewhere, you could probably look at other
examples (say cifs_demultiplexer_thread() in fs/cifs/connect.c).

[ I didn't really want to give that example, because I get a nervous
breakdown when looking at that code myself, and would actively
like to save other fellow developers from a similar fate. To know
what I'm talking about, set your xterm to display 40 rows, and
then look at the line numbers 3139-3218 in that file, especially
3190-3212. Yes, what you see there is a map of Sulawesi [1]
subliminally hidden in Linux kernel code :-) ]

Anyway, cifs_demultiplexer_thread() is just your normal kthread that:

(1) Ignores all signals
(2) Calls perma-blocking-till-signalled functions such as tcp_recvmsg
   (via kernel_recvmsg)
(3) Calls send-to-socket kind of functions

Hence, it could get into trouble when the umount(2) code wants to stop
it with kthread_stop() and it happens to be blocked in tcp_recvmsg()
with noblock = 0 (hence sk_rcvtimeo == MAX_SCHEDULE_TIMEOUT), thus
would handle the wake_up_process() internally, and not break out, hence
not check kthread_should_stop() which it should -- all this ensuring that
the kthread never gets killed, kthread_stop() hangs, and the umount(2)
from userspace never returns ...

But they've solved it as follows (as I suggested earlier):

(1) First, set sock->sk_rcvtimeo to some "magical value" in your code
   that sets up the socket params after socket->proto_ops->connect().
   See ipv4_connect(), f.e. in CIFS they've set it up to 7 seconds. But
   that's arbitrarily chosen -- this'll ensure your tcp_recvmsg() isn't
   perma-blocking in the first place, but will unblock/return every 7 secs,
   and thus get a chance to check kthread_should_stop().

(2) From the code that wants to kill/stop the kthread (module exit, or
   umount(2) most probably), just ensure you make a call to force_sig()
   before kthread_stop() on that kthread -- see cifs_umount() in the
   same file. This'll ensure that even if the kthread is currently sleeping
   in tcp_recvmsg(), it'll be signalled to break out from there, and thus
   check kthread_should_stop().

(3) Note that not a single line of code needs to be written extra in the
   kthread itself for this to work -- nothing to allow / handle signals ...

Just this, should be enough for a smooth conversion to kthreads, IMHO.
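
A userspace analogue of point (1), assuming POSIX sockets rather than the in-kernel API: SO_RCVTIMEO plays the role of sk_rcvtimeo, and an atomic flag stands in for kthread_should_stop(). The function names are made up for illustration; this is a sketch of the pattern, not CIFS or DRBD code.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Bound every recv() on fd to timeout_ms (userspace stand-in for sk_rcvtimeo). */
int set_recv_timeout(int fd, long timeout_ms)
{
    struct timeval tv;

    assert(timeout_ms >= 0);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/*
 * Block in recv() until data arrives, the peer closes, a hard error occurs,
 * or *stop is found set after a timeout. Returns the byte count, 0 on EOF,
 * -1 on error, and -2 when asked to stop.
 */
ssize_t recv_until_stopped(int fd, void *buf, size_t len, atomic_bool *stop)
{
    for (;;) {
        ssize_t n = recv(fd, buf, len, 0);

        if (n >= 0)
            return n;
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;
        /* Timed out or interrupted: the one place the stop flag is checked. */
        if (atomic_load(stop))
            return -2;
    }
}
```

The timeout bounds how long a stop request can go unnoticed; the force_sig() in point (2) is what removes that delay in the kernel case, since a signal interrupts the blocked receive.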

> >> +/* THINK maybe we actually want to use the default "event/%s" worker
> >> + * threads or similar in linux 2.6, which uses per cpu data and threads.
> >> + *
> >> + * To be general, this might need a spin_lock member.
> >> + * For now, please use the mdev->req_lock to protect list_head,
> >> + * see drbd_queue_work below.
> >> + */
> >> +struct drbd_work_queue {
> >> +   struct list_head q;
> >> +   struct semaphore s; /* producers up it, worker down()s it */
> >> +   spinlock_t q_lock;  /* to protect the list. */
> >> +};
> >>
> >> Umm, how about fixing this

Re: [patch 2.6.23-rc1] dma_free_coherent() needs irqs enabled (sigh)

2007-07-24 Thread Russell King

On Tue, Jul 24, 2007 at 04:08:11PM -0700, David Brownell wrote:
> On Tuesday 24 July 2007, Russell King wrote:
> > On Tue, Jul 24, 2007 at 02:29:05PM -0700, David Brownell wrote:
> > > On at least ARM (and I'm told MIPS too) dma_free_coherent() has a newish
> > > call context requirement:  unlike its dma_alloc_coherent() sibling, it
> > > may not be called with IRQs disabled.  (This was new behavior on ARM as
> > > of late 2006, caused by ARM SMP updates.)
> > 
> > I think you got the year wrong:
> > 
> > 5edf71ae (Russell King  2005-11-25 15:52:51 + 364)  
> > WARN_ON(irqs_disabled());
> > 
> > which is due to this commit:
> > 
> > [ARM] Do not call flush_tlb_kernel_range() with IRQs disabled.
> 
> This little "to do" list item has been sitting in my mailbox way
> too long then.  Certainly since it was fair to say "last year"!  ;)

Are you intentionally not reading what I said?

> > Signed-off-by: Russell King <[EMAIL PROTECTED]>
> 
> Thanks...

That was part of the commit I quoted, not an endorsement of your patch,
though I think it does deserve an:

Acked-by: Russell King <[EMAIL PROTECTED]>

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

Re: [patch 2.6.23-rc1] dma_free_coherent() needs irqs enabled (sigh)

2007-07-24 Thread David Brownell

On Tuesday 24 July 2007, Russell King wrote:
> On Tue, Jul 24, 2007 at 02:29:05PM -0700, David Brownell wrote:
> > On at least ARM (and I'm told MIPS too) dma_free_coherent() has a newish
> > call context requirement:  unlike its dma_alloc_coherent() sibling, it
> > may not be called with IRQs disabled.  (This was new behavior on ARM as
> > of late 2006, caused by ARM SMP updates.)
> 
> I think you got the year wrong:
> 
> 5edf71ae (Russell King  2005-11-25 15:52:51 + 364)  
> WARN_ON(irqs_disabled());
> 
> which is due to this commit:
> 
> [ARM] Do not call flush_tlb_kernel_range() with IRQs disabled.

This little "to do" list item has been sitting in my mailbox way
too long then.  Certainly since it was fair to say "last year"!  ;)

Of course, 2.6.23-rc1 also merges all the gadget API updates I
included to cope with this annoyance.  So with any luck the issue
will now finally have been properly whupped.


> Signed-off-by: Russell King <[EMAIL PROTECTED]>

Thanks...

 
> > Since it looks like that restriction won't be removed, this patch changes
> > the definition of the API to include that requirement.
> 
> The PCI DMA-mapping API had this restriction.  For some reason, this
> restriction was not carried forward into the DMA-API.  Unfortunately
> the restriction can not be removed without causing the problems
> described in the commit which introduced it.

Right, I noticed that.

- Dave

Re: [PATCH] pata_hpt37x: Fix 2.6.22 clock PLL regression

2007-07-24 Thread Linus Torvalds



On Tue, 24 Jul 2007, Alan Cox wrote:
> 
>   Just one version of Linux ago
>   The PLL code broke - oh no!
>   But set the right mode
>   And fix up the code
>   Makes the PLL timing sync go

Alan, I'm getting a bit worried about you.

Linus

Re: Power Management framework proposal

2007-07-24 Thread david


On Wed, 25 Jul 2007, Benjamin Herrenschmidt wrote:


On Tue, 2007-07-24 at 13:14 -0700, [EMAIL PROTECTED] wrote:

I think we need a set of constraints that trickle down the power tree
and limit what a given driver can do locally.


what sort of constraints are you thinking of?


A parent power state defines what states children can be in, for
example. A way to express those dependencies would be nice. Or
alternatively, the power state of all the children defines the power
state a parent can go in automatically.


Ok, I see two things here.

1. do you really want to try and propagate things like this from one to 
the other, or would it be good enough to flag the issue and let the 
software selecting the modes implement this constraint?


2. how can you standardize the requirements?

at the very least you have

for this mode all children must be off

for this mode all children must be in a mode that includes a 'suspended' 
flag (this could be made implicit by saying that you must suspend children 
before parents, and then just flagging the 'suspended, but not off' modes)


what requirements are needed? (I'm sure that there are others, but 
hopefully it's possible to avoid requirements like 'the clock speed for 
device A must be >X to allow device B to operate in mode Y')
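
A small sketch of the "flag the issue and let the software implement the constraint" option, assuming the two requirements listed above are the only ones. The flag names and pm_parent_mode_allowed() are invented for illustration, and treating an off child as also satisfying the 'suspended' requirement is an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags describing the mode a device is currently in. */
#define PM_MODE_OFF                 (1u << 0) /* powered off */
#define PM_MODE_SUSPENDED           (1u << 1) /* suspended, but not off */

/* Hypothetical flags describing what a parent mode requires of children. */
#define PM_NEED_CHILDREN_OFF        (1u << 2)
#define PM_NEED_CHILDREN_SUSPENDED  (1u << 3)

/*
 * Check whether a parent may enter a mode with the given requirement flags,
 * given the current mode flags of each child. An off child is assumed to
 * satisfy the 'suspended' requirement as well.
 */
bool pm_parent_mode_allowed(unsigned int parent_flags,
                            const unsigned int *child_flags, size_t n)
{
    assert(child_flags != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        unsigned int c = child_flags[i];

        if ((parent_flags & PM_NEED_CHILDREN_OFF) && !(c & PM_MODE_OFF))
            return false;
        if ((parent_flags & PM_NEED_CHILDREN_SUSPENDED) &&
            !(c & (PM_MODE_SUSPENDED | PM_MODE_OFF)))
            return false;
    }
    return true;
}
```

With this shape the driver only publishes flags, and the mode-selecting software runs the check before asking the parent to switch. Requirements such as a minimum clock on another device would not fit into two bits, which is the open question raised above.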


David Lang

Re: understanding firmware loader for speedtouch (kernel 2.6.21.5)

2007-07-24 Thread Duncan Sands

Hi Mikie,

> Do you have any news regarding my case of slow transfers via
> Speedtouch USB modem on linux ?

I found my old speedtouch modem and tested here.  I got 2.1 Mbaud
bulk downspeed, and 3 Mbaud isoc downspeed.  This last is half the
speed my line supports, so something is wrong [*].  Unfortunately
I'm not very motivated to try to find out what, because I don't
use this modem myself anymore.  It looks like someone needs to do
some more reverse engineering work on the windows driver.

Ciao,

Duncan.

[*] I got the same numbers the last time I tested isoc support,
but at that time 3 Mbaud was slightly less than the maximum speed
of my line, which explains why I didn't realize that there is a
problem.

Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK

2007-07-24 Thread Christoph Lameter

On Tue, 24 Jul 2007, Andrew Morton wrote:

> I think I'll duck this for now.  Otherwise I have a suspicion that I'll
> be the first person to run it and I'm too old for such excitement.

I always had the suspicion that you have some magical script 
which will immediately tell you that a patch is not working ;-)

Works fine on x86_64 (on top of the ctor cleanup patchset) and passes the 
kernel build test but then there may be creatively designed drivers and 
such that pass these flags to the slab allocators which will now BUG.

Re: Patches for REALLY TINY 386 kernels

2007-07-24 Thread Adrian Bunk

On Tue, Jul 24, 2007 at 01:50:35PM -0700, Yinghai Lu wrote:
> On 7/24/07, Helge Hafting <[EMAIL PROTECTED]> wrote:
>> Andi Kleen wrote:
>> >> Some people are putting Linux kernels in the "BIOS" (i.e. ROM chip) when
>> >> using LinuxBIOS (www.linuxbios.org). It _does_ make a lot of difference
>> >> there how big the kernel is. At the moment you can't do that with
>> >> anything smaller than a 1 MB chip. But if people could use 512 KB chips
>> >> because the kernel is small enough that would sure be a great thing.
>> >>
>> >
>> > I'm sure it would be possible to save a lot of text size. But I don't
>> > think removing the relatively small CPUID code is the right way.
>> > That is just a big maintenance issue for little gain.
>> >
>> Well - anyone compiling linux for BIOS usage is targetting
>> a single machine.  So an ability to target a single machine is useful,
>> i.e. run the CPUID at compile-time, put the answer in a constant/macro,
>> let the optimizer prune the alternatives. :-)
>
> we are using AMD64 + LinuxBIOS + Kernel (without acpi) + kexec to load
> final kernel.
> So we can use drivers in kernel for any media (SCSI, SATA, IB,...),
> not like EFI need every driver re-porting. and We could use KVM in
> kernel to load other OS if needed.
>
> The problem is Kernel is getting bigger and bigger. and old Tiny
> kernel is stopping at 2.6.18...
>...

Please send:
- the .config for the last kernel small enough
- your size limit
- your gcc version
and I'll look at this.

> YH

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: Is PIE randomization breaking klibc binaries?

2007-07-24 Thread H. Peter Anvin


Chuck Ebbert wrote:


Okay, I tested with Fedora on x86_64 and it worked there too.
(Not that that proves much.)

Did you capture any of the error messages, like the address
of the segfault?



FWIW, on x86-64, this should show up in dmesg.

-hpa

Re: Patches for REALLY TINY 386 kernels

2007-07-24 Thread Willy Tarreau

On Wed, Jul 18, 2007 at 08:55:50AM -0700, H. Peter Anvin wrote:
> Andi Kleen wrote:
> > 
> >> Already with these patches I can compile a zImage kernel that is 450kb
> >> large (890kb decompressed)
> > 
> > The important part is not how big the vmlinux is, but how much
> > memory is actually used after boot. 
> > 
> > I expect concentrating some of the dynamic data structures would
> > be more fruitful in fact.
> > 
> 
> Well, how big the vmlinux file is matters if it doesn't fit in memory
> with enough time to get to the phase where it is dumping the init
> sections.  *If that is not the issue*, then axing stuff like CPUID is a
> major lose in terms of code maintainability for zero gain.

Not only that, but the size of the vmlinux matters when you have limited
flash memory to put it on. Having packaged single floppy-based firewalls
for a few years, I can assure you that even one kB sometimes matters!

>   -hpa

Regards,
Willy


Re: Is PIE randomization breaking klibc binaries?

2007-07-24 Thread Chuck Ebbert

On 07/24/2007 06:00 PM, Ulrich Kunitz wrote:
> On 07-07-24 16:57 Chuck Ebbert wrote:
> 
>>> $ strace ./cat
> >>> execve("./cat", ["./cat"], [/* 55 vars */]) = -1 ENOENT (No such file or directory)
>>> ...
> 
> Chuck, my binaries run always into a segmentation violation. So
> ENOENT is not the issue. (Notify it was on an x86-64.)
> 

Okay, I tested with Fedora on x86_64 and it worked there too.
(Not that that proves much.)

Did you capture any of the error messages, like the address
of the segfault?


Re: [PATCH 1/7] lguest: documentation pt I: Preparation

2007-07-24 Thread Rusty Russell

On Tue, 2007-07-24 at 13:04 +0100, Alan Cox wrote:
> Dear Rusty I think that we know
> Your code has good things to show
> But an unreliable guide
> To the poetic aside
> Would probably steal the show

That and your (slightly dated?) mm documentation were awesome.  But can
we stop now?

Please?
Rusty.


Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered unnecessarily

2007-07-24 Thread Benjamin Herrenschmidt

On Tue, 2007-07-24 at 17:55 -0400, Trond Myklebust wrote:
> 
> If you want to use bitops as spinlocks you should rather be using
> . That also does the right thing w.r.t.
> pre-emption and sparse locking annotations.

Heh, I didn't know about those... A bit annoying that I can't override
them in the arch, I might be able to save a barrier or two here. Our
test_and_set_bit() contains both barriers for lock and unlock semantics
to cope with all kind of abuses, but your bit_spinlock obviously doesn't
need that.
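
To make the barrier point concrete, here is a userspace sketch of a bit spinlock using C11 atomics, where taking the lock needs only acquire ordering and dropping it only release ordering. The function names are invented for this example and it is not the kernel's bit_spinlock implementation (which also deals with preemption and sparse annotations):

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Try once to take the lock held in bit nr of *word; acquire on success. */
bool bit_trylock(atomic_ulong *word, unsigned int nr)
{
    unsigned long mask, old;

    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    mask = 1UL << nr;
    old = atomic_fetch_or_explicit(word, mask, memory_order_acquire);
    return !(old & mask); /* we own the lock only if the bit was clear */
}

/* Spin until the bit lock is taken. */
void bit_lock(atomic_ulong *word, unsigned int nr)
{
    while (!bit_trylock(word, nr))
        ; /* busy-wait; a real implementation would relax the CPU here */
}

/* Drop the lock: clearing the bit needs release ordering only. */
void bit_unlock(atomic_ulong *word, unsigned int nr)
{
    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    atomic_fetch_and_explicit(word, ~(1UL << nr), memory_order_release);
}
```

A general-purpose test-and-set that must tolerate every caller would use a full barrier on both sides; a lock-specific primitive can drop one on each operation, which is the saving being discussed.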

Cheers,
Ben.


Re: [PATCH 2/8] dm: Fix workqueue leak for raid5

2007-07-24 Thread Dan Williams


On 7/24/07, Dmitry Monakhov <[EMAIL PROTECTED]> wrote:

Signed-off-by: Dmitry Monakhov <[EMAIL PROTECTED]>
---
 drivers/md/raid5.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 0f30826..79dd2c7 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4985,6 +4985,8 @@ static int run(mddev_t *mddev)
 abort:
    if (conf) {
        print_raid5_conf(conf);
+       if (conf->workqueue)
+           destroy_workqueue(conf->workqueue);
        safe_put_page(conf->spare_page);
        kfree(conf->disks);
        kfree(conf->stripe_hashtbl);
--

I assume this patch is against -mm.  I will fold it into:

git://lost.foo-projects.org/~dwillia2/git/iop md-for-linus

Thanks,
Dan

Re: [Lksctp-developers] __unsafe() usage

2007-07-24 Thread Rusty Russell

On Tue, 2007-07-24 at 09:05 -0400, Vlad Yasevich wrote:
> 
> Please don't remove module_exit point for SCTP.  Simply removing the
> __unsafe() call will be sufficient.
> 
> The code has recently been cleaned up to allow safe unloading and I am
> working on final cleanups.  It currently works correctly with forced
> unloading.

Thanks Vlad!

I think that's everyone...

Cheers,
Rusty.
===
Remove "unsafe" from module struct

Adrian Bunk points out that "unsafe" was used to mark modules touched by
the deprecated MOD_INC_USE_COUNT interface, which has long gone.  It's
time to remove the member from the module structure, as well.

If you want a module which can't unload, don't register an exit
function.

(Vlad Yasevich says SCTP is now safe to unload, so just remove the
__unsafe there).

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
Acked-by: Shannon Nelson <[EMAIL PROTECTED]>
Acked-by: Dan Williams <[EMAIL PROTECTED]>
Cc: Vlad Yasevich <[EMAIL PROTECTED]>

diff -r d7af727512fd drivers/dma/ioatdma.c
--- a/drivers/dma/ioatdma.c Tue Jul 24 08:30:05 2007 +1000
+++ b/drivers/dma/ioatdma.c Tue Jul 24 09:11:11 2007 +1000
@@ -811,18 +811,17 @@ MODULE_AUTHOR("Intel Corporation");
 
 static int __init ioat_init_module(void)
 {
-   /* it's currently unsafe to unload this module */
-   /* if forced, worst case is that rmmod hangs */
-   __unsafe(THIS_MODULE);
-
    return pci_register_driver(_pci_driver);
 }
 
 module_init(ioat_init_module);
 
+/* it's currently unsafe to unload this module */
+#if 0
 static void __exit ioat_exit_module(void)
 {
    pci_unregister_driver(_pci_driver);
 }
 
 module_exit(ioat_exit_module);
+#endif
diff -r d7af727512fd drivers/dma/iop-adma.c
--- a/drivers/dma/iop-adma.c Tue Jul 24 08:30:05 2007 +1000
+++ b/drivers/dma/iop-adma.c Tue Jul 24 09:11:30 2007 +1000
@@ -1446,21 +1446,20 @@ static struct platform_driver iop_adma_d
 
 static int __init iop_adma_init (void)
 {
-   /* it's currently unsafe to unload this module */
-   /* if forced, worst case is that rmmod hangs */
-   __unsafe(THIS_MODULE);
-
    return platform_driver_register(&iop_adma_driver);
 }
 
+/* it's currently unsafe to unload this module */
+#if 0
 static void __exit iop_adma_exit (void)
 {
    platform_driver_unregister(&iop_adma_driver);
    return;
 }
+module_exit(iop_adma_exit);
+#endif
 
 module_init(iop_adma_init);
-module_exit(iop_adma_exit);
 
 MODULE_AUTHOR("Intel Corporation");
 MODULE_DESCRIPTION("IOP ADMA Engine Driver");
diff -r d7af727512fd include/linux/module.h
--- a/include/linux/module.h Tue Jul 24 08:30:05 2007 +1000
+++ b/include/linux/module.h Tue Jul 24 09:00:19 2007 +1000
@@ -312,9 +312,6 @@ struct module
    /* Arch-specific module values */
    struct mod_arch_specific arch;
 
-   /* Am I unsafe to unload? */
-   int unsafe;
-
    unsigned int taints; /* same bits as kernel:tainted */
 
 #ifdef CONFIG_GENERIC_BUG
@@ -441,16 +438,6 @@ static inline void __module_get(struct m
    __mod ? __mod->name : "kernel"; \
 })
 
-#define __unsafe(mod)                                               \
-do {                                                                \
-   if (mod && !(mod)->unsafe) {                                     \
-       printk(KERN_WARNING                                          \
-              "Module %s cannot be unloaded due to unsafe usage in" \
-              " %s:%u\n", (mod)->name, __FILE__, __LINE__);         \
-       (mod)->unsafe = 1;                                           \
-   }                                                                \
-} while(0)
-
 /* For kallsyms to ask for address resolution.  NULL means not found. */
 const char *module_address_lookup(unsigned long addr,
                                   unsigned long *symbolsize,
@@ -518,8 +505,6 @@ static inline void module_put(struct mod
 
 #define module_name(mod) "kernel"
 
-#define __unsafe(mod)
-
 /* For kallsyms to ask for address resolution.  NULL means not found. */
 static inline const char *module_address_lookup(unsigned long addr,
                                                 unsigned long *symbolsize,
diff -r d7af727512fd kernel/module.c
--- a/kernel/module.c   Tue Jul 24 08:30:05 2007 +1000
+++ b/kernel/module.c   Tue Jul 24 09:00:58 2007 +1000
@@ -692,8 +692,7 @@ sys_delete_module(const char __user *nam
    }
 
    /* If it has an init func, it must have an exit func to unload */
-   if ((mod->init != NULL && mod->exit == NULL)
-       || mod->unsafe) {
+   if (mod->init && !mod->exit) {
        forced = try_force_unload(flags);
        if (!forced) {
            /* This module can't be removed */
@@ -739,11 +738,6 @@ static void print_unload_info(struct seq
    list_for_each_entry(use, &mod->modules_which_use_me, list) {
        printed_something = 1;
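
For reference, the unload check that the kernel/module.c hunk above simplifies can be modelled in userspace. This is only a sketch under stated assumptions: struct module_sketch, needs_forced_unload() and the two dummy callbacks are made-up names for illustration, not kernel API, and the real sys_delete_module() does considerably more around this test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct module fields the check reads. */
struct module_sketch {
    int (*init)(void);
    void (*exit)(void);
};

/*
 * Same condition as the patched test, `mod->init && !mod->exit`:
 * a module that registered an init function but no exit function
 * can only be removed by a forced unload.
 */
static bool needs_forced_unload(const struct module_sketch *mod)
{
    assert(mod != NULL);
    return mod->init != NULL && mod->exit == NULL;
}

/* Dummy callbacks so the sketch can be exercised. */
static int sketch_init(void) { return 0; }
static void sketch_exit(void) { }
```

With the "unsafe" member gone, leaving out the exit function is the only thing that makes needs_forced_unload() return true, which matches the advice in the changelog.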

Re: [2/2] 2.6.23-rc1: known regressions

2007-07-24 Thread Tilman Schmidt

On 23.07.2007 at 11:47, Michal Piotrowski wrote:

> Virtualization
> 
> Subject : 2.6.22-git17 boot failure (XEN)
> References  : http://lkml.org/lkml/2007/7/22/266
> Last known good : ?
> Submitter   : Tilman Schmidt <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : ?
> Status  : unknown

"Not a regression."

With the help of Jeremy Fitzhardinge, Andi Kleen and Olaf Hering,
I was able to isolate the cause. It lies outside the kernel, in
the distribution's init script. So the entry can be dropped.

Thanks,
Tilman

-- 
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Best before, if unopened: (see reverse side)




[2.6.23-rc1 REGRESSION] ThinkPad T42 poweroff failure by "PM: Introduce pm_power_off_prepare"

2007-07-24 Thread YOSHIFUJI Hideaki / 吉藤英明

Hello.

Linux 2.6.23-rc1 fails to power off my ThinkPad T42.
Git-bisect told me that the following commit is to blame,
and by reverting that commit, it works appropriately.

Regards,

--yoshfuji

bd804eba1c8597cbb7cd5a5f9fe886aae16a079a is first bad commit
commit bd804eba1c8597cbb7cd5a5f9fe886aae16a079a
Author: Rafael J. Wysocki <[EMAIL PROTECTED]>
Date:   Thu Jul 19 01:47:40 2007 -0700

PM: Introduce pm_power_off_prepare

Introduce the pm_power_off_prepare() callback that can be registered by the
interested platforms in analogy with pm_idle() and pm_power_off(), used for
preparing the system to power off (needed by ACPI).

This allows us to drop acpi_sysclass and device_acpi that are only defined in
order to register the ACPI power off preparation callback, which is needed by
pm_power_off() registered in a much different way.

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
Acked-by: Pavel Machek <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

:040000 040000 624870eb14bf9841fa2dca2cf13cc4c9a0479005 af79f843f3383bbecaed84d493926939cf0e1c12 M  drivers
:040000 040000 9b28a21970668ce133916bbe8d8fd4a61bce23d7 80fc84d7982369205dcf94029e3958c90db14bf0 M  include
:040000 040000 9ce5c8b5d3f87c121b2f7bc6e02bc814648a2739 2e2e1468dfa0db9dee5bd204fd3f802a975a6454 M  kernel

-- 
YOSHIFUJI Hideaki @ USAGI Project  <[EMAIL PROTECTED]>
GPG-FP  : 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Towards eliminating the freezer

2007-07-24 Thread Alan Stern

On Tue, 24 Jul 2007, Rafael J. Wysocki wrote:

> > Then device_suspend() can be simplified:
> > 
> > int device_suspend(pm_message_t state)
> > {
> >     int error = 0;
> > 
> >     might_sleep();
> >     list_for_each_entry_reverse(dev, &dpm_locked, power.entry) {
> >         error = suspend_device(dev, state);
> > 
> >         if (error) {
> >             printk(KERN_ERR "Could not suspend device %s: "
> >                    "error %d%s\n",
> >                    kobject_name(&dev->kobj), error,
> >                    error == -EAGAIN ?
> >                    " (please convert to suspend_late)" : "");
> >             break;
> >         }
> >         list_move(&dev->power.entry, &dpm_off);
> 
> Is that safe with list_for_each_entry_reverse?

No.  I guess it'll have to resemble the other code.
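
To spell out why: the plain reverse iterator steps to the previous entry by reading the current entry's link after the loop body has run, and by then list_move() has already put that entry on the other list. One common way around this is to keep taking the tail of the source list until it is empty. The following is a minimal userspace sketch of that pattern, not the eventual patch; struct list_node and every function name here are hypothetical stand-ins for the kernel list helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on the kernel's. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static int list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

static void list_unlink(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Insert n right after head (what a move to the front of a list does). */
static void list_push_front(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Insert n right before head, i.e. at the tail. */
static void list_push_back(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/*
 * Walk `from` back to front, moving every node to the front of `to`.
 * The tail is re-read from the list head on each pass, so no cursor
 * is left pointing into the wrong list after a move.
 */
static size_t drain_reverse(struct list_node *from, struct list_node *to)
{
    size_t moved = 0;

    assert(from != to);
    while (!list_is_empty(from)) {
        struct list_node *n = from->prev;   /* current tail */

        list_unlink(n);
        list_push_front(n, to);
        moved++;
    }
    return moved;
}
```

A real version would stop at the first error and leave the remaining entries on the source list, which this loop shape allows since nothing is cached across iterations.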

> Yes, that looks fine. 
> 
> So, who's writing the patch? ;-)

I can do it.  You haven't made any changes to this part of the code, 
have you?  My work tends to be based on Linus's tree, not -mm.

Something to watch out for: With all the extra locking, we run the risk
of blocking the keventd workqueue.  This may or may not matter, but to
be safe perhaps there should be a new general-purpose workqueue which
_expects_ to block (or freeze) during suspends.  Any work routine that 
involves adding or removing a device should go on the new workqueue.

> > Incidentally, what is dpm_mtx for?  It doesn't seem to do anything 
> > useful.  Is it a relic of the former runtime PM support?
> 
> I think so.  IMO it can be removed.
> 
> I also think it would be nicer to have all of the functions in
> drivers/base/power/{main|suspend|resume}.c moved to one file.

Yes, they are all similar enough that there isn't much point keeping 
them separate.

Alan Stern


Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK

2007-07-24 Thread Andrew Morton

On Tue, 24 Jul 2007 12:36:59 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Tue, 24 Jul 2007, Andrew Morton wrote:
> 
> > > __GFP_MOVABLE The movability of a slab is determined by the
> > >   options specified at kmem_cache_create time. If this is
> > >   specified at kmalloc time then we will have some random
> > >   slabs movable and others not. 
> > 
> > Yes, they seem inappropriate.  Especially the first two.
> 
> The third one would randomize __GFP_MOVABLE allocs from the page allocator 
> since one __GFP_MOVABLE alloc may allocate a slab that is then used for 
> !__GFP_MOVABLE allocs.
> 
> Maybe something like this? Note that we may get into some churn here 
> since slab allocations that pass any of these flags will BUG.
> 
> 
> 
> GFP_LEVEL_MASK: Remove __GFP_COLD, __GFP_COMP and __GFP_MOVABLE
> 
> Add an explanation for the GFP_LEVEL_MASK and remove the flags
> that should not be passed through derived allocators.
> 
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

I think I'll duck this for now.  Otherwise I have a suspicion that I'll
be the first person to run it and I'm too old for such excitement.

Re: [patch 2.6.23-rc1] dma_free_coherent() needs irqs enabled (sigh)

2007-07-24 Thread Russell King

On Tue, Jul 24, 2007 at 02:29:05PM -0700, David Brownell wrote:
> On at least ARM (and I'm told MIPS too) dma_free_coherent() has a newish
> call context requirement:  unlike its dma_alloc_coherent() sibling, it
> may not be called with IRQs disabled.  (This was new behavior on ARM as
> of late 2006, caused by ARM SMP updates.)

I think you got the year wrong:

5edf71ae (Russell King  2005-11-25 15:52:51 +0000 364)  WARN_ON(irqs_disabled());

which is due to this commit:

[ARM] Do not call flush_tlb_kernel_range() with IRQs disabled.

We must not call TLB maintenance operations with interrupts disabled,
otherwise we risk a lockup in the SMP IPI code.

This means that consistent_free() can not be called from a context with
IRQs disabled.  In addition, we must not hold the lock in consistent_free
when we call flush_tlb_kernel_range().  However, we must continue to
prevent consistent_alloc() from re-using the memory region until we've
finished tearing down the mapping and dealing with the TLB.

Therefore, leave the vm_region entry in the list, but mark it inactive
before dropping the lock and starting the tear-down process.  After the
mapping has been torn down, re-acquire the lock and remove the entry
from the list.

Signed-off-by: Russell King <[EMAIL PROTECTED]>
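
The two-phase removal described in that commit message can be illustrated in userspace. This is a rough sketch and not the ARM code: the slot table, the lock_held flag standing in for the spinlock, and all function names are invented for illustration, and slow_teardown() merely marks where the unmapping and TLB flush would happen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum slot_state { SLOT_FREE, SLOT_ACTIVE, SLOT_TEARDOWN };

#define NSLOTS 4
static enum slot_state slots[NSLOTS];   /* zero-initialised: all SLOT_FREE */
static bool lock_held;                  /* stand-in for a spinlock */

static void sketch_lock(void)   { assert(!lock_held); lock_held = true; }
static void sketch_unlock(void) { assert(lock_held);  lock_held = false; }

/* Hand out the first free slot; slots still in teardown are skipped. */
static int slot_alloc(void)
{
    int found = -1;

    sketch_lock();
    for (int i = 0; i < NSLOTS; i++) {
        if (slots[i] == SLOT_FREE) {
            slots[i] = SLOT_ACTIVE;
            found = i;
            break;
        }
    }
    sketch_unlock();
    return found;
}

/* Phase 1: mark the slot inactive but keep it reserved. */
static void slot_retire(int i)
{
    sketch_lock();
    assert(slots[i] == SLOT_ACTIVE);
    slots[i] = SLOT_TEARDOWN;
    sketch_unlock();
}

/* Placeholder for the slow part, which must run without the lock. */
static void slow_teardown(int i)
{
    (void)i;
    assert(!lock_held);
}

/* Phase 2: only now may the allocator reuse the slot. */
static void slot_release(int i)
{
    sketch_lock();
    assert(slots[i] == SLOT_TEARDOWN);
    slots[i] = SLOT_FREE;
    sketch_unlock();
}

static void slot_free(int i)
{
    slot_retire(i);
    slow_teardown(i);
    slot_release(i);
}
```

The point of the intermediate state is that the allocator cannot hand the region out again while the tear-down is still in flight, even though the lock is dropped for the duration.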

> Since it looks like that restriction won't be removed, this patch changes
> the definition of the API to include that requirement.

The PCI DMA-mapping API had this restriction.  For some reason, this
restriction was not carried forward into the DMA-API.  Unfortunately
the restriction can not be removed without causing the problems
described in the commit which introduced it.

Or alternatively we scrap ARM SMP entirely, which isn't going to happen.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-24 Thread Alan Cox

> That cannot be a justification for breaking serial port probe that has 
> been working for 10+ years.

Agree. With my "nearest thing we have to a serial maintainer" hat on,
please revert this, Andrew. Bjorn - let's discuss putting the right APIs in
place so you can busy out serial ports from other drivers when they are a
shared resource.

Alan

Re: Is PIE randomization breaking klibc binaries?

2007-07-24 Thread Ulrich Kunitz

On 07-07-24 16:57 Chuck Ebbert wrote:

> > $ strace ./cat
> > execve("./cat", ["./cat"], [/* 55 vars */]) = -1 ENOENT (No such file or directory)
> > ...

Chuck, my binaries always run into a segmentation violation, so
ENOENT is not the issue. (Note that this was on x86-64.)

> > $ file cat
> > cat: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked (uses shared libs), stripped
> > 
> > Funny nobody noticed that before...
> > 
> 
> After installing klibc.so and klibc-.so into /lib everything works:
> 
> Program Headers:
>   Type   Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
>   PHDR   0x34 0x08048034 0x08048034 0xa0 0xa0 R E 0x4
>   INTERP 0xd4 0x080480d4 0x080480d4 0x2a 0x2a R   0x1
>       [Requesting program interpreter: /lib/klibc-58kBUyV_qhVvkMnaxy8A7N8rLak.so]
> 

Yes, these files were present in the initrd.img file. I checked by
unpacking it. Note also that I used git-bisect to identify the PIE
patch, which requires successful builds. Reverting the patch clearly
resolved the issue in the end.

> Ulrich, did your initrd contain the correct .so?

Sure! I have only one klibc-*.so on my box, in /lib. I diffed the
file in the unpacked initrd.img against the file in /lib and there
was no difference.

I always recreate the initial ramdisk after the kernel rebuild
with make install and my own installkernel script, which uses
mkinitramfs. The mkinitramfs script ensures that the klibc .so
object from /lib and the klibc binaries from /usr/lib/klibc/bin
are copied into the initrd image. Usually that works without any
issue on x86 and x86-64. PPC can't use make install, but I use
mkinitramfs there too, which handles klibc the same way.

> Did you try rebuilding klibc after building the new kernel?

Rebuilding klibc doesn't make sense from my point of view. What
would be the point of it?

-- 
Uli Kunitz

Re: commit 7e92b4fc34 - x86, serial: convert legacy COM ports to platform devices - broke my serial console

2007-07-24 Thread Alan Cox

>   - use setserial to make the serial driver forget about ttyS2
> so an IR driver could claim it, or
> 
>   - use setserial to change the IRQ to 3 and just use the device
> in SIR mode, which is 16550-compatible so you can use the
> serial driver
> 
> I didn't express that very clearly in the changelog.

So the actual problem is quite different. Your IR driver for the port
should have an interface to tell the serial layer to make it unavailable.
End of problem, and you can then use either service without setserial magic.

Alan

Re: 2.6.23-rc1: known regressions with patches

2007-07-24 Thread Michal Piotrowski

On 24/07/07, Greg KH <[EMAIL PROTECTED]> wrote:
> On Mon, Jul 23, 2007 at 11:47:44AM +0200, Michal Piotrowski wrote:
> > Unclassified
> >
> > Subject : kobject link failure
> > References  : http://lkml.org/lkml/2007/7/19/495
> > Last known good : ?
>
> This is caused by a patch that happened after 2.6.22 was released, so it
> is a regression.

Yes, I know. "?" is the default value. When someone says that a bug
appeared after 2.6.22-git7, I add this information to "Last known
good".

> > Submitter   : Jan Engelhardt <[EMAIL PROTECTED]>
> > Caused-By   : ?
> > Handled-By  : Cornelia Huck <[EMAIL PROTECTED]>
> >   Greg Kroah-Hartman <[EMAIL PROTECTED]>
> > Patch   : http://lkml.org/lkml/2007/7/20/143
> > Status  : patch available
>
> I'll be sending Cornelia's patch to Linus within the week to fix this.
>
> thanks,
>
> greg k-h

Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/

Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered unnecessarily

2007-07-24 Thread Trond Myklebust

On Wed, 2007-07-25 at 07:37 +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-24 at 11:13 -0700, Linus Torvalds wrote:
> > 
> > IOW, if you do a spinlock with the bitops, the locking side should be
> > able 
> > to use a "test_and_set_bit()" on its own, but the unlocking side
> > should be
> > 
> > smp_mb__before_clear_bit();
> > clear_bit();
> > 
> > because the ones that don't return a value also don't imply a memory 
> > barrier.
> 
> Yup. But I much prefer Nick's clear_bit_unlock() :-)
> 
> Ben

If you want to use bitops as spinlocks you should rather be using
. That also does the right thing w.r.t.
pre-emption and sparse locking annotations.
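
To make the ordering Linus describes concrete, here is a userspace sketch of a lock built on a single bit, written with C11 atomics rather than the kernel bitops. It is an illustration only: bit_trylock() and bit_unlock() are hypothetical names, and the acquire/release orderings are my assumption of the closest C11 equivalent. It also leaves out the pre-emption handling and sparse annotations that the kernel's own bit_spin_lock() helpers take care of.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Lock side: one atomic read-modify-write that both sets the bit and
 * reports its old value, in the role of test_and_set_bit().  Acquire
 * ordering keeps the critical section from moving above it.
 */
static bool bit_trylock(atomic_ulong *word, unsigned int bit)
{
    unsigned long mask = 1UL << bit;
    unsigned long old;

    old = atomic_fetch_or_explicit(word, mask, memory_order_acquire);
    return (old & mask) == 0;   /* true if we were the one to set it */
}

/*
 * Unlock side: clearing the bit needs release ordering so earlier
 * stores are visible before the lock appears free.  This plays the
 * role of the barrier issued before clear_bit().
 */
static void bit_unlock(atomic_ulong *word, unsigned int bit)
{
    unsigned long mask = 1UL << bit;

    assert(atomic_load_explicit(word, memory_order_relaxed) & mask);
    atomic_fetch_and_explicit(word, ~mask, memory_order_release);
}
```

A blocking lock would simply spin on bit_trylock() until it returns true.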

Trond


Re: console UTF-8 fixes

2007-07-24 Thread H. Peter Anvin

Samuel Thibault wrote:
> Hi,
> 
> Egmont got some UTF-8 fixes in mainline, Andrew Morton suggested it
> might be a good time to remember about bug 7746 Support for unicode dead
> keys: http://bugzilla.kernel.org/show_bug.cgi?id=7746 :
> 
> « Quoting a mail from Vojtech Pavlik:
> 
> "Several languages (Polish, Czech, Slovak, ...) use dead keys (keys that
> don't do anything per se, but put an accent on the next letter). And
> now almost everyone is switching to unicode. And Linux kernel doesn't
> support unicode for dead keys. This means trouble."
> (full mail at
> http://www.ussg.iu.edu/hypermail/linux/kernel/0405.3/1387.html)
> 
> And indeed, see http://bugs.debian.org/404503
> 
> There is a more recent patch proposed on
> http://www.ussg.iu.edu/hypermail/linux/kernel/0503.2/1723.html
> 
> Is there any objection to the proposed way? (extending the internal
> type, and add a new ioctl for uploading unicode dead keys). »
> 

Makes sense to me.

-hpa

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-24 Thread Chris Snook


Chris Friesen wrote:
> Chris Snook wrote:
>
> > I don't think Chris's scenario has much bearing on your patch.  What
> > he wants is to have a task that will always be running, but can't
> > monopolize either CPU. This is useful for certain realtime workloads,
> > but as I've said before, realtime requires explicit resource
> > allocation.  I don't think this is very relevant to SCHED_FAIR balancing.
>
> I'm not actually using the scenario I described, it's just sort of a
> worst-case load-balancing thought experiment.
>
> What we want to be able to do is to specify a fraction of each cpu for
> each task group.  We don't want to have to affine tasks to particular cpus.

A fraction of *each* CPU, or a fraction of *total* CPU?  Per-cpu granularity
doesn't make anything more fair.  You've got a big bucket of MIPS you want to
divide between certain groups, but it shouldn't make a difference which CPUs
those MIPS come from, other than the fact that we try to minimize overhead
induced by migration.

> This means that the load balancer must be group-aware, and must trigger
> a re-balance (possibly just for a particular group) as soon as the cpu
> allocation for that group is used up on a particular cpu.

If I have two threads with the same priority, and two CPUs, the scheduler will
put one on each CPU, and they'll run happily without any migration or
balancing.  It sounds like you're saying that every X milliseconds, you want
both to expire, be forbidden from running on the current CPU for the next X
milliseconds, and then be migrated to the other CPU.  There's no gain in
fairness here, and there's a big drop in performance.

I suggested local fairness as a means to achieve global fairness because it
could reduce overhead, and by adding the margin of error at each level in the
locality hierarchy, you can get an algorithm which naturally tolerates the
level of unfairness beyond which it is impossible to optimize.  Strict local
fairness for its own sake doesn't accomplish anything that's better than
global fairness.


-- Chris

Re: console UTF-8 fixes

2007-07-24 Thread Samuel Thibault

Hi,

Egmont got some UTF-8 fixes in mainline, Andrew Morton suggested it
might be a good time to remember about bug 7746 Support for unicode dead
keys: http://bugzilla.kernel.org/show_bug.cgi?id=7746 :

« Quoting a mail from Vojtech Pavlik:

"Several languages (Polish, Czech, Slovak, ...) use dead keys (keys that
don't do anything per se, but put an accent on the next letter). And
now almost everyone is switching to unicode. And Linux kernel doesn't
support unicode for dead keys. This means trouble."
(full mail at
http://www.ussg.iu.edu/hypermail/linux/kernel/0405.3/1387.html)

And indeed, see http://bugs.debian.org/404503

There is a more recent patch proposed on
http://www.ussg.iu.edu/hypermail/linux/kernel/0503.2/1723.html

Is there any objection to the proposed way? (extending the internal
type, and add a new ioctl for uploading unicode dead keys). »

Samuel

Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered unnecessarily

2007-07-24 Thread Linus Torvalds



On Tue, 24 Jul 2007, Jeremy Fitzhardinge wrote:
> >
> > But gcc docs also talk about the other things volatile means, including 
> > "not significantly moved".
> 
> Actually, it doesn't.  In fact it goes out of its way to say that "asm
> volatile" statements can be moved quite a bit, with respect to other
> asms, other code, jumps, basic blocks, etc.

Ahh. That's newer.

Historically, gcc manuals used to say "may not be deleted or significantly 
reordered".

So they've weakened what it means, probably exactly because it wasn't 
well-defined before either.
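
Given that, code which needs a firm ordering guarantee from the compiler usually states it explicitly instead of leaning on how far an "asm volatile" may move. A small sketch, assuming GCC or Clang extended asm: an empty asm with a "memory" clobber acts as a compiler-only barrier. The names compiler_barrier() and publish() are made up for this example, and nothing here orders accesses as seen by other CPUs.

```c
#include <assert.h>

/*
 * Empty asm with a "memory" clobber: the compiler must assume memory
 * may have been read or written here, so it cannot move loads or
 * stores across this point or keep their values cached in registers.
 * No machine instruction is emitted.
 */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/*
 * Store the payload, then the flag, and keep the compiler from
 * swapping the two stores.  Hardware reordering is a separate matter.
 */
static void publish(int *data, int *flag, int value)
{
    assert(data != flag);
    *data = value;
    compiler_barrier();
    *flag = 1;
}
```

The cost is the one the patch title complains about: every clobber forces the compiler to discard what it knows about memory, so it is worth adding only where the ordering actually matters.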

Linus
