date:20070529

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Mark Adler


On May 29, 2007, at 8:15 AM, Satyam Sharma wrote:

skipping some checksum calculation if some
flag (PRESET_DICT) is absent from the input stream about to
be decompressed ...


You don't need to dissect the header manually to look for that bit.   
If you feed inflate() at least the first two bytes, it will return  
immediately with the Z_NEED_DICT return code if a preset dictionary  
is requested.  You can force inflate() to return immediately after  
decoding the two byte header even if a preset dictionary is not  
requested by using the Z_BLOCK flush code.
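
For reference, the bit in question sits in the second byte of the two-byte zlib header described in RFC 1950. The sketch below shows the manual dissection that the advice above makes unnecessary; it only illustrates what inflate() examines when it decodes the header. The name zlib_header_wants_dict() is made up for this illustration and is not part of zlib.

```c
#include <stddef.h>

/*
 * Illustrative only: inspect the two-byte zlib header (RFC 1950).
 * Returns 1 if FDICT (preset dictionary) is set, 0 if it is clear,
 * and -1 if the bytes fail the basic header checks done here
 * (CINFO is not validated in this sketch).
 * zlib_header_wants_dict() is a hypothetical name, not a zlib function.
 */
int zlib_header_wants_dict(const unsigned char *buf, size_t len)
{
	unsigned int cmf, flg;

	if (buf == NULL || len < 2)
		return -1;

	cmf = buf[0];
	flg = buf[1];

	/* CM (low nibble of CMF) must be 8, meaning deflate. */
	if ((cmf & 0x0f) != 8)
		return -1;

	/* CMF and FLG, read as a big-endian 16-bit value, must be a multiple of 31. */
	if (((cmf << 8) | flg) % 31 != 0)
		return -1;

	/* FDICT is bit 5 of FLG. */
	return (flg & 0x20) ? 1 : 0;
}
```

In real code, letting inflate() report Z_NEED_DICT (or stopping it after the header with Z_BLOCK) keeps this knowledge inside the library.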


Mark

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: tty-related oops in latest kernel(s)?

2007-05-29 Thread Pekka Enberg


On 5/30/07, Tero Roponen <[EMAIL PROTECTED]> wrote:

Hmmm, I just found something interesting. In 2.6.21.3 the /sbin/init
gets corrupted when I watch the video!

$ cp /sbin/init init.before
$ mplayer kiwi.flv
$ cp /sbin/init init.after

The sha1sums are here:

52c8d643057619cbe137b8e69d4709ce3bdd832d  init.after
8efc7864a5b535a9e336fa82e9d7f112f3d956c1  init.before

It seems that something corrupts memory somewhere...


To debug this a bit further:

$ od -a -t x1 -v init.after > init.after.dump
$ od -a -t x1 -v init.before > init.before.dump
$ diff -u init.before.dump init.after.dump | less

-0011340  nul nul nul  e9  f0  fe  ff  ff  ff   %   < soh enq  bs   h  80
-   00  00  00  e9  f0  fe  ff  ff  ff  25  3c  01  05  08  68  80
+001y ack nul nul   y ack nul nul   y ack nul nul   y ack nul nul
+   79  06  00  00  79  06  00  00  79  06  00  00  79  06  00  00
+0010020y ack nul nul   y ack nul nul   y ack nul nul   y ack nul nul
+   79  06  00  00  79  06  00  00  79  06  00  00  79  06  00  00
+0011340y ack nul nul   y ack nul nul  ff   %   < soh enq  bs   h  80
+   79  06  00  00  79  06  00  00  ff  25  3c  01  05  08  68  80

The file at offset 001 - 0011348 is overwritten with the byte
pattern 79 06 00 00.

Do you see anything in the logs or is this a silent corruption? Did
you see this corruption with 2.6.19 or 2.6.22-rc3?

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta

On 5/30/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:

On May 28 2007 20:04, Nitin Gupta wrote:
>
> * Changelog vs. original LZO:
> 1) Used standard/kernel defined data types: (this eliminated _huge_
> #ifdef chunks)
> lzo_bytep -> unsigned char *
> lzo_uint -> size_t
> lzo_xint -> size_t

Is this safe (as far as compressed LZO stream is concerned) --
or is it even needed (could it be unsigned int)?

> - m_pos -= (*(const unsigned short *)ip) >> 2;
> -#else
> - m_pos = op - 1;
> - m_pos -= (ip[0] >> 2) + (ip[1] << 6);
> -#endif
>
> + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2);
>
> (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
> *** Need testing on big endian machine ***

On i386, both cpu_to_le16 and le16_to_cpu do nothing.
On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-)
they swap 1234<->4321.

It is the bytestream (ip) that is reinterpreted as uint16_t.
And I really doubt that the LZO author has a big-endian machine,
given the days of ubiquitous x86.

le16_to_cpu it is.

I just missed the point that leXX_to_cpu() and cpu_to_leXX() are
identical on a BE machine anyway. But then why do you think it should be
le16_to_cpu() -- how will this make any difference?

For your ref (from big_endian.h):
#define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
#define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
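
A userspace sketch of the point being discussed, not kernel code: a byte swap is its own inverse, so on a big-endian machine both macros compile to the same swap and differ only in the type annotation (__le16 in or out). Since the stream holds a little-endian value and the code wants a CPU-order number, le16_to_cpu is the name that describes the conversion. The helper names below (swab16_sketch, load_le16_sketch, offset_bytewise, offset_le16) are invented for the example.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's __swab16(): swap the two bytes. */
uint16_t swab16_sketch(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Endian-neutral read of a little-endian 16-bit value from the stream. */
uint16_t load_le16_sketch(const unsigned char *ip)
{
	return (uint16_t)(ip[0] | (ip[1] << 8));
}

/* The byte-wise form from the removed #else branch. */
unsigned int offset_bytewise(const unsigned char *ip)
{
	return (ip[0] >> 2) + (ip[1] << 6);
}

/* The 16-bit form, once the value has been read as little-endian. */
unsigned int offset_le16(const unsigned char *ip)
{
	return load_le16_sketch(ip) >> 2;
}
```

Both offset functions return the same value for any two input bytes, on either byte order, which is the property the one-line replacement has to preserve.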

- Nitin

Re: [AppArmor 01/41] Pass struct vfsmount to the inode_create LSM hook

2007-05-29 Thread Crispin Cowan

[EMAIL PROTECTED] wrote:
> On Mon, 28 May 2007 21:54:46 EDT, Kyle Moffett said:
>   
>> Average users are not supposed to be writing security policy.  To be  
>> honest, even average-level system administrators should not be  
>> writing security policy.
That explains so much! "SELinux: you're too dumb to use it, so just keep
your hands in your pockets." :-)

AppArmor was designed to allow your average sys admin to write a
security policy. It makes different design choices than SELinux to
achieve that goal. As a result, AppArmor is an utter failure when
compared to SELinux's goals, and SELinux in turn is an utter failure
when compared to AppArmor's goals.

Which is why we have LSM: so we don't have to have this argument here,
again.

>>   It's OK for such sysadmins to tweak  
>> existing policy to give access to additional web-docs or such, but  
>> only expert sysadmin/developers or security professionals should be  
>> writing security policy.  It's just too damn easy to get completely  
>> wrong.
>> 
> The single biggest challenge in computer security at the present time is how
> to build *and deploy* servers that stay reasonably secure even when run by the
> average wave-a-dead-chicken sysadmin, and desktop-class boxes that can survive
> the best attempts of Joe Sixpack's "Ooh shiny" reflex, and Joe's kid's
> attempts to evade the nannyware that Joe had somebody install.
>   
That is a tall order. You can mostly achieve it by not giving the user
the root password, but I'm not sure you would like the result :-)

Both SELinux and AppArmor can be configured so tightly that you are not
going to get to install malware, by preventing the user from installing
software. This isn't what users want, so they invariably bypass security
and install shiny things if they own the box. SELinux and AppArmor can't
help but fail if you put them in that kind of harm's way.

Crispin

-- 
Crispin Cowan, Ph.D.   http://crispincowan.com/~crispin/
Director of Software Engineering   http://novell.com
   Security: It's not linear


Re: [patch 1/2] m68k: runtime patching infrastructure

2007-05-29 Thread Eric Dumazet


Andrew Morton wrote:

On Mon, 28 May 2007 21:16:31 +0200
Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:


--- a/include/asm-m68k/module.h
+++ b/include/asm-m68k/module.h
@@ -1,7 +1,38 @@
 #ifndef _ASM_M68K_MODULE_H
 #define _ASM_M68K_MODULE_H
-struct mod_arch_specific { };
+
+struct mod_arch_specific {
+   struct m68k_fixup_info *fixup_start, *fixup_end;
+};


Here we use struct m68k_fixup_info.


+#define MODULE_ARCH_INIT { \
+   .fixup_start= __start_fixup,\
+   .fixup_end  = __stop_fixup, \
+}
+
 #define Elf_Shdr Elf32_Shdr
 #define Elf_Sym Elf32_Sym
 #define Elf_Ehdr Elf32_Ehdr
+
+
+enum m68k_fixup_type {
+   m68k_fixup_memoffset,
+};
+
+struct m68k_fixup_info {
+   enum m68k_fixup_type type;
+   void *addr;
+};


and later we define it.

How come it doesn't spit warnings?

I think it could be tightened up even if it happens not to warn?



struct a {
	struct not_yet_defined *start, *end;
};

struct not_yet_defined {
	void *foo;
};

is valid and gives no warnings.
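
A small sketch of why this compiles cleanly, as I understand the C rules: the first mention of the tag inside a file-scope struct declares an incomplete type at file scope, and a pointer to an incomplete type is allowed. The type only has to be complete where it is dereferenced or sized. (The gcc warning people often have in mind is the one for a struct tag that first appears inside a function prototype's parameter list.) The names struct holder and count_entries() are made up for the example.

```c
#include <stddef.h>

/* First mention of the tag: declares an incomplete type at file scope. */
struct holder {
	struct not_yet_defined *start, *end;
};

/* The later definition completes the same type. */
struct not_yet_defined {
	void *foo;
};

/* Pointer arithmetic is fine here because the type is complete by now. */
size_t count_entries(const struct holder *h)
{
	return (size_t)(h->end - h->start);
}
```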

Still, I haven't tried to compile an m68k kernel, so I guess I shouldn't speak
here :)

Re: [AppArmor 01/41] Pass struct vfsmount to the inode_create LSM hook

2007-05-29 Thread Crispin Cowan

Pavel Machek wrote:
>> * Hard links: AppArmor explicitly mediates permission to make a hard
>> 
> Unfortunately, aparmor is by design limited to subset of distro
> (network daemons).
That is not true. AppArmor is designed to confine any application you do
not want to trust completely. This includes network daemons (Apache,
Sendmail, BIND, etc.) network clients (Firefox, Thunderbird,
GAIM/Pidgin, etc.) and any other application that mediates trust: where
data on one side of the application is higher privilege than potential
attackers on the other side of the application.

>  Unfortunately, some other programs (passwd, vi)
> routinely make hardlinks. So AA mediating hardlink is not enough, as
> vi will happily hardlink /etc/shadow into /etc/.vi-shadow-1234.
>   
I have no idea what you are talking about. I routinely profile vim on
stage in AppArmor presentations, and it presents no such problems.

Caveat: the circumstances under which you would actually want to profile
vi are rather limited, most users would not want to do that.

In another post Pavel Machek wrote:
> Hmm, I guess I'd love "it is useless on multiuser boxes" to become
> standard part of AA advertising.
That is usually around slide 7 of the standard AppArmor presentation,
right next to the remarks about how multiuser boxes are nearly extinct
dinosaurs. AppArmor gets some of its simplicity and ease of use by not
considering that vanishing use case. Even so, AppArmor does have uses on
multiuser boxes, just not the uses Pavel wishes for. Fine, use a
different tool for a different task, AppArmor has plenty of use cases.

The limitation Pavel is referring to is that AppArmor does not secure
processes that are not profiled by AppArmor. We know, this is
intentional, and contributes to AppArmor's ease of use, and does not
generate a hole if you consider every process exposed to (say) network
attack and confine it.

Crispin

-- 
Crispin Cowan, Ph.D.   http://crispincowan.com/~crispin/
Director of Software Engineering   http://novell.com
   Security: It's not linear


[PATCH 2.6.21.3] ieee1394: eth1394: bring back a parent device

2007-05-29 Thread Stefan Richter

Date: Mon, 21 May 2007 01:05:41 +0200
From: Stefan Richter <[EMAIL PROTECTED]>
Subject: ieee1394: eth1394: bring back a parent device

This adds a real parent device to eth1394's ethX device like in Linux
2.6.20 and older.  However, due to unfinished conversion of the ieee1394
away from class_device, we now refer to the FireWire controller's PCI
device as the parent, not to the ieee1394 driver's fw-host device.

Having a real parent device instead of a virtual one allows udev scripts
to distinguish eth1394 interfaces from networking bridges, bondings and
the likes.

Fixes a regression since 2.6.21:
https://bugs.gentoo.org/show_bug.cgi?id=177199

Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
---
Same as commit ef50a6c59dc66f22eba67704e291d709f21e0456.

 drivers/ieee1394/eth1394.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

Index: linux-2.6.21.3/drivers/ieee1394/eth1394.c
===
--- linux-2.6.21.3.orig/drivers/ieee1394/eth1394.c
+++ linux-2.6.21.3/drivers/ieee1394/eth1394.c
@@ -584,10 +584,9 @@ static void ether1394_add_host (struct h
 }
 
 	SET_MODULE_OWNER(dev);
-#if 0
-	/* FIXME - Is this the correct parent device anyway? */
-	SET_NETDEV_DEV(dev, &host->device);
-#endif
+
+	/* This used to be &host->device in Linux 2.6.20 and before. */
+	SET_NETDEV_DEV(dev, host->device.parent);
 
 	priv = netdev_priv(dev);
 

-- 
Stefan Richter
-=-=-=== -=-= -
http://arcgraph.de/sr/


Re: [patch -mm 1/1] remove useless tolower in isofs

2007-05-29 Thread young dave


Hi,


Your email client replaces tabs with spaces.


The tab replacement was caused by copying the patch from a vi session in
gnome-terminal. I find the proper way is to copy it from a GUI
editor.

Regards
dave

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta

On 5/30/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:

On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote:
>
> On May 28 2007 19:11, Adrian Bunk wrote:
You completely miss the point of my question.

It's about the performance improvements of the modified code that were
mentioned.

What you are talking about shouldn't have any effect on the generated
code.

After Daniel refined his test program, we can see that the perf gain
is < 2%, which can surely be attributed to cleanups such as removing
unnecessary casts in tight loops.

Again, all the original code has been retained _as-is_. Whatever was
changed, has been mentioned in that detailed changelog that I post
along with patch.

If someone wants to review these 500 lines with this changelog in hand,
it should not take more than a couple of hours. If no one can see any
problem in the code by now, and considering that it has been tested on
x86(_32), amd64 and ppc and gives somewhat better performance than the
original, then I believe it is unnecessarily hanging outside of the -mm tree.

I also contacted the author (Markus Oberhumer) regarding the above changes,
but he does not seem to be responding right now. Still, if it gets into -mm
and gets used by various related projects, then it should be good
enough for mainline too.

Cheers,
Nitin

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta


On 5/30/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote:

I just noticed a bug in my testbed/benchmarking code. It's fixed, but I
decided to compare version 6 of the code against the *unsafe* decompressor
again. The results of the three runs I've put it through after changing it to
compare against the unsafe decompressor were startling. `Tiny's` compressor
is still faster - I've seen it be rated up to 3% faster. The decompressor,
OTOH, when compared to the unsafe version (which is the comparison that
started me on this binge of hacking), is more than 7% worse. About 11% slower
on the original test against a C source file, and about 6% slower for random
data.


Unsafe vs safe is within 10%. It's okay.


However, looking at the numbers involved, I can't see a reason to keep
the unsafe version around - the percentages look worse than they are - from 1
to 3 microseconds.


Not just numbers. Most applications cannot afford to use the unsafe
version anyway (filesystems, for example).

(well, the compressed-cache people might want those extra

usecs - but the difference will never be noticeable anywhere outside the
kernel)

DRH



Compressed cache people need every single percent of that
performance. For now, ccaching is not ready for mainline (many things
still need to be done), so until then I will leave out the unsafe version.
If compressed caching is ever on its way to mainline, _then_ I will try
to add the unsafe version back. I see no other project that
really cares about the unsafe version, so it's okay to leave it out.


- Nitin

Re: [patch] tweak make htmldocs (nochunks and better index).

2007-05-29 Thread Randy Dunlap

On Tue, 29 May 2007 15:28:33 -0400 Rob Landley wrote:

> Signed-off-by: Rob Landley <[EMAIL PROTECTED]>
> 
> A) The "xhtml-nochunks" version of "make htmldocs" is easier to print.

and makes it darn near impossible to print one page or function
if that's all that someone is interested in.

I'd prefer to leave the html output in separate files and use
pdf or ps output for printing entire files.  But that's just my take on it.


> B) Update the generated index.html to use the html <title> as a
>description for each file it links to.
> 
> --
> 
> --- git/Documentation/DocBook/Makefile 2007-05-23 16:36:56.0 -0400
> +++ work/Documentation/DocBook/Makefile 2007-05-26 23:11:36.0 -0400
> @@ -141,9 +141,12 @@
>  cat $(HTML) >> $(main_idx)
>  
>  quiet_cmd_db2html = HTML   $@
> -  cmd_db2html = xmlto xhtml $(XMLTOFLAGS) -o $(patsubst %.html,%,$@) $< && \
> - echo ' \
> -$(patsubst %.html,%,$(notdir $@))' > $@
> +  cmd_db2html = xmlto xhtml-nochunks $(XMLTOFLAGS) -o $(patsubst %.html,%,$@) $< && \
> + NAME='$(patsubst %.html,%,$(notdir $@))'; \
> + echo -n "$$NAME" > $@ ; \
> + sed -nre '[EMAIL PROTECTED](.*)[EMAIL PROTECTED]@p' \
> + "Documentation/DocBook/$$NAME/$$NAME.html" >> $@ ; \
> + echo '<\p>' >> $@
>  %.html:  %.xml
>   @(which xmlto > /dev/null 2>&1) || \

(repeating due to email bounced/rejected when from @oracle.com:)

What is <\p> supposed to do?

Here's what I see in index.html (beginning lines of it):

Linux Kernel HTML Documentation
Kernel Version: 2.6.22-rc3

deviceiobookBus-Independent Device Accesses <\p>

filesystemsLinux Filesystems API <\p>

gadgetUSB Gadget API for Linux <\p>

genericirqLinux generic IRQ handling <\p>

kernel-apiThe Linux Kernel API <\p>

kernel-hackingUnreliable Guide To Hacking The Linux Kernel <\p>

kernel-lockingUnreliable Guide To Locking <\p>

---
so it looks like it could use a space after the short name/before
the long name/title, and fix the <\p>.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

Re: [PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-29 Thread Linus Torvalds

On Tue, 29 May 2007, Robert Hancock wrote:
>
> This patch adds validation of the MMCONFIG table against the ACPI reserved
> motherboard resources.

Please fix the formatting of your code.

"for" and "if" are not functions, and they have a space before the 
parenthesis.

And pretty much every single conditional in this patch is spread out over 
two or more lines and has at least three different indentations. There's 
something wrong here. Code can't look this bad and still be fine. Some of 
this looks like random whitespace noise:

+   if(is_acpi_reserved(cfg->address,
+   cfg->address + size - 1))
+   printk(KERN_NOTICE "PCI: MCFG area at %Lx reserved "
+   "in ACPI motherboard resources\n",
+   cfg->address);
+   else {

That's just horrid. Please try to make the code _look_ nicer.

For example, just making "is_acpi_reserved()" take a start/len thing 
instead, would allow you to at least do

	if (is_acpi_reserved(cfg->address, size)) {
		printk(KERN_NOTICE "PCI: MCFG area at %Lx reserved "
			"in ACPI motherboard resources\n",
			cfg->address);
	} else {
		...

(and has the braces right too - don't pair an unbraced "if ()" with a 
braced "else" statement), and that together with making the body of the 
for-loop a separate function would possibly make that code read a lot 
better.

Same goes for this thing:

+   if((pci_probe & PCI_PROBE_CONF1) &&
+  e820_all_mapped(cfg->address,
+  cfg->address + size - 1,
+  E820_RESERVED))
+   printk(KERN_NOTICE "PCI: MCFG area at %Lx 
reserved in E820\n",
+   cfg->address);
+   else
+   goto reject;

there really is *not* a highly coveted prize for having the most different 
indentation in the fewest possible lines of code!
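
A minimal userspace sketch of the restructuring suggested above, assuming a start/length interface and a helper for the loop body. Everything here is hypothetical: range_is_reserved(), check_one_config() and the one-entry reserved[] table are stand-ins invented for illustration, not the functions from the patch, and the E820 fallback is left out.

```c
#include <stddef.h>

/* Hypothetical table of reserved regions, standing in for ACPI resources. */
struct region {
	unsigned long long start;
	unsigned long long len;
};

static const struct region reserved[] = {
	{ 0xe0000000ULL, 0x10000000ULL },
};

/*
 * Start/length interface: callers no longer compute "address + size - 1".
 * Returns 1 if [start, start + len) lies entirely inside one reserved region.
 */
int range_is_reserved(unsigned long long start, unsigned long long len)
{
	size_t i;

	for (i = 0; i < sizeof(reserved) / sizeof(reserved[0]); i++) {
		const struct region *r = &reserved[i];

		if (len == 0 || len > r->len)
			continue;
		if (start >= r->start && start - r->start <= r->len - len)
			return 1;
	}
	return 0;
}

/*
 * Loop body pulled into its own function: one level of indentation
 * and no condition split across several lines.
 */
int check_one_config(unsigned long long address, unsigned long long size)
{
	if (range_is_reserved(address, size))
		return 0;	/* accept */
	return -1;		/* reject */
}
```

With the helper in place, the calling loop reduces to one call per table entry and each condition fits on a single line.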

Yeah, I realize that maybe this is nit-picking, but trying to read this 
patch really does make you go blind. It violates so many coding standards 
that it's almost impossible to read the code itself. It's made worse by 
the fact that you then also used Thunderbird to send the patch, and had it 
set for

Content-type: text/plain; charset=ISO-8859-1; format=flowed

where that "format=flowed" means that basically no mail-client will be 
able to read it sanely (because a lot of them will re-flow the text), and 
you have to save it to a file before you can even comment on it.

Gaah. See

http://mbligh.org/linuxdocs/Email/Clients/Thunderbird

where the most important sentence is "Thunderbird is written by aged whore 
monkeys stoned on crack". But it also talks about how to disable that 
idiotic format=flowed for patches, and how to make sure it's not wrapping.

But as mentioned, your patch itself had some whitespace issues even aside 
from the regular Thunderbird breakage.

Linus

Re: [patch] CFS scheduler, -v12

2007-05-29 Thread Siddha, Suresh B

On Tue, May 29, 2007 at 07:18:18PM -0700, Peter Williams wrote:
> Siddha, Suresh B wrote:
> > I can try 32-bit kernel to check.
> 
> Don't bother.  I just checked 2.6.22-rc3 and the problem is not present
> which means something between rc2 and rc3 has fixed the problem.  I hate
> it when problems (appear to) fix themselves as it usually means they're
> just hiding.
> 
> I didn't see any patches between rc2 and rc3 that were likely to have
> fixed this (but doesn't mean there wasn't one).  I'm wondering whether I
> should do a git bisect to see if I can find where it got fixed?
> 
> Could you see if you can reproduce it on 2.6.22-rc2?

No. Just tried 2.6.22-rc2 64-bit version at runlevel 3 on my remote
system at office. 15 attempts didn't show the issue.

Sure that nothing changed in your test setup?

More experiments tomorrow morning..

thanks,
suresh

Re: BUG: sleeping function called from invalid context at kernel/fork.c:385

2007-05-29 Thread David Chinner

On Wed, May 30, 2007 at 01:30:41PM +1000, David Chinner wrote:
> On Wed, May 23, 2007 at 10:44:16AM -0700, Luck, Tony wrote:
> > > > Saw this when running strace -f on a script on 2.6.21 on ia64:
> > > > 
> > > > BUG: sleeping function called from invalid context at kernel/fork.c:385
> > > > in_atomic():1, irqs_disabled():0
> > ... snip ...
> > > > I could reproduce it via 'strace -f sleep 1'
> > > > 
> > >
> > > I'd say this is specific to ia64.   Someone would have spotted it on
> > > x86 by now.
> > 
> > I tried the "strace -f sleep 1" on 2.6.22-rc2, and I didn't see this "BUG"
> > there.  Can you try your other test cases on the latest kernel.  If it has
> > already been fixed we can see about identifying the patch for possible
> > backport to 2.6.21.stable
> 
> Sorry for taking so long to get back to this - I still see this in
> 2.6.22-rc2.

Hmmm - I wonder if my tree is screwed up in some weird way. I'm seeing link
warnings as well:

  AS  .tmp_kallsyms2.o
  LD  vmlinux
  SYSMAP  System.map
  SYSMAP  .tmp_System.map
  MODPOST vmlinux
WARNING: init/built-in.o - Section mismatch: reference to .init.data: from .sdata after 'root_mountflags' (at offset 0x38)
WARNING: init/built-in.o - Section mismatch: reference to .init.data:ino from .sdata after 'root_mountflags' (at offset 0x40)
WARNING: arch/ia64/kernel/built-in.o - Section mismatch: reference to .init.data:smp_boot_data from .sdata before 'acpi_irq_model' (at offset -0x0)
WARNING: arch/ia64/kernel/built-in.o - Section mismatch: reference to .init.data:rsvd_region from .sdata between 'ia64_sal' (at offset 0x118) and 'ia64_i_cache_stride_shift'
WARNING: arch/ia64/kernel/built-in.o - Section mismatch: reference to .init.data:smp_boot_data from .sdata between 'cpu.25267' (at offset 0x2e8) and 'sal_state_for_booting_cpu'
WARNING: arch/ia64/mm/built-in.o - Section mismatch: reference to .init.data: from .sdata between 'hpage_shift' (at offset 0x50) and 'first_time.25897'
WARNING: arch/ia64/mm/built-in.o - Section mismatch: reference to .init.data: from .sdata between 'hpage_shift' (at offset 0x68) and 'first_time.25897'
WARNING: arch/ia64/mm/built-in.o - Section mismatch: reference to .init.data: from .sdata between 'hpage_shift' (at offset 0x70) and 'first_time.25897'
WARNING: arch/ia64/mm/built-in.o - Section mismatch: reference to .init.data: from .sdata between 'hpage_shift' (at offset 0x88) and 'first_time.25897'
WARNING: arch/ia64/mm/built-in.o - Section mismatch: reference to .init.data: from .sdata between 'hpage_shift' (at offset 0x90) and 'first_time.25897'
.

There are more, but they are all section mismatch warnings.
I tried a make mrproper to see if that would fix it, but it didn't.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Re: [1/4] 2.6.22-rc3: known regressions

2007-05-29 Thread Sam Ravnborg

On Tue, May 29, 2007 at 02:52:53PM +0200, Michal Piotrowski wrote:
> Hi all,
> 
> Here is a list of some known regressions in 2.6.22-rc3.
> 
> 
> Kbuild
> 
> Subject: make M=$PWD modules_install does nothing
> References : http://lkml.org/lkml/2007/5/27/190
> Submitter  : Andrey Borzenkov <[EMAIL PROTECTED]>
> Status : Unknown
Closed - see http://lkml.org/lkml/2007/5/29/497

Sam

Re: [PATCH 7/7] Add /proc/sys/vm/compact_node for the explicit compaction of a node

2007-05-29 Thread Christoph Lameter

On Tue, 29 May 2007, Mel Gorman wrote:

> + if (nodeid < 0)
> + return -EINVAL;
> +
> + pgdat = NODE_DATA(nodeid);
> + if (!pgdat || pgdat->node_id != nodeid)
> + return -EINVAL;

You cannot pass an arbitrary number to node data since NODE_DATA may do a 
simple array lookup.

Check for node < nr_node_ids first.

pgdat->node_id != nodeid? Sounds like something you should BUG() on.

IA64's NODE_DATA is

struct ia64_node_data {
	short			active_cpu_count;
	short			node;
	struct pglist_data	*pg_data_ptrs[MAX_NUMNODES];
};

/*
 * Given a node id, return a pointer to the pg_data_t for the node.
 *
 * NODE_DATA	- should be used in all code not related to system
 *		  initialization. It uses pernode data structures to minimize
 *		  offnode memory references. However, these structure are not
 *		  present during boot. This macro can be used once cpu_init
 *		  completes.
 */
#define NODE_DATA(nid)		(local_node_data->pg_data_ptrs[nid])

x86_64 also does

#define NODE_DATA(nid)  (node_data[nid])
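
A userspace sketch of the ordering being asked for: range-check the id before it is ever used as an array index, then check the looked-up pointer. All names with a _sketch suffix, and compact_node_checked() itself, are stand-ins invented for this example; the real patch would use nr_node_ids and NODE_DATA() directly.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_NUMNODES_SKETCH 8

struct pgdat_sketch {
	int node_id;
};

/* Stand-ins for the kernel's node_data[] array and nr_node_ids. */
static struct pgdat_sketch node0 = { 0 };
static struct pgdat_sketch node1 = { 1 };
static struct pgdat_sketch *node_data_sketch[MAX_NUMNODES_SKETCH] = {
	&node0, &node1,		/* slot 2 is a possible but absent node */
};
static int nr_node_ids_sketch = 3;

/* Returns 0 if the node id is usable, -EINVAL otherwise. */
int compact_node_checked(int nodeid)
{
	struct pgdat_sketch *pgdat;

	/* Bounds check first: the lookup below is a plain array index. */
	if (nodeid < 0 || nodeid >= nr_node_ids_sketch)
		return -EINVAL;

	pgdat = node_data_sketch[nodeid];
	if (pgdat == NULL)
		return -EINVAL;

	return 0;
}
```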


Re: [patch] tweak make htmldocs (nochunks and better index).

2007-05-29 Thread Randy Dunlap

On Tue, 29 May 2007 15:28:33 -0400 Rob Landley wrote:

> Signed-off-by: Rob Landley <[EMAIL PROTECTED]>
> 
> A) The "xhtml-nochunks" version of "make htmldocs" is easier to print.
> 
> B) Update the generated index.html to use the html <title> as a
>description for each file it links to.
> 
> --
> 
> --- git/Documentation/DocBook/Makefile 2007-05-23 16:36:56.0 -0400
> +++ work/Documentation/DocBook/Makefile 2007-05-26 23:11:36.0 -0400
> @@ -141,9 +141,12 @@
>  cat $(HTML) >> $(main_idx)
>  
>  quiet_cmd_db2html = HTML   $@
> -  cmd_db2html = xmlto xhtml $(XMLTOFLAGS) -o $(patsubst %.html,%,$@) $< && \
> - echo ' \
> -$(patsubst %.html,%,$(notdir $@))' > $@
> +  cmd_db2html = xmlto xhtml-nochunks $(XMLTOFLAGS) -o $(patsubst %.html,%,$@) $< && \
> + NAME='$(patsubst %.html,%,$(notdir $@))'; \
> + echo -n "$$NAME" > $@ ; \
> + sed -nre '[EMAIL PROTECTED](.*)[EMAIL PROTECTED]@p' \
> + "Documentation/DocBook/$$NAME/$$NAME.html" >> $@ ; \
> + echo '<\p>' >> $@
>  
>  %.html:  %.xml
>   @(which xmlto > /dev/null 2>&1) || \
> -

What is <\p> supposed to do?

Here's what I see in index.html (beginning lines of it):

Linux Kernel HTML Documentation
Kernel Version: 2.6.22-rc3

deviceiobookBus-Independent Device Accesses <\p>

filesystemsLinux Filesystems API <\p>

gadgetUSB Gadget API for Linux <\p>

genericirqLinux generic IRQ handling <\p>

kernel-apiThe Linux Kernel API <\p>

kernel-hackingUnreliable Guide To Hacking The Linux Kernel <\p>

kernel-lockingUnreliable Guide To Locking <\p> 



---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

Re: Jiffies wraparound is not treated in the schedstats

2007-05-29 Thread Oleg Verych

On 2006-11-09, Balbir Singh <[EMAIL PROTECTED]> wrote:
> From: Balbir Singh <[EMAIL PROTECTED]>
> Subject: Re: Jiffies wraparound is not treated in the schedstats
> Date: Thu, 09 Nov 2006 11:59:45 +0530
>
> Mauricio Lin wrote:
>> Hi Balbir,
>> 
>> Do you know why in the sched_info_arrive() and sched_info_depart()
>> functions the calculation of delta_jiffies does not use the time_after
>> or time_before macro to prevent  the miscalculation when jiffies
>> overflow?
>> 
>> For instance the delta_jiffies variable is simply calculated as:
>> 
>> delta_jiffies = now - t->sched_info.last_queued;
>> 
>> Do not you think the more logical way should be
>> 
>> if (time_after(now, t->sched_info.last_queued))
>>delta_jiffies = now - t->sched_info.last_queued;
>> else
>>delta_jiffies = (MAX_JIFFIES - t->sched_info.last_queued) + now
>> 
>
> What's MAX_JIFFIES? Is it MAX_ULONG? jiffies is unsigned long
> so you'll have to be careful with unsigned long arithmetic.
>
> Consider that now is 5 and t->sched_info.last_queued is 10.
>
> On my system
>
> perl -e '{printf("%lu\n", -5 + (1<<32) - 1);}'
> 4294967291

So, according to this
> perl -e '{printf("%lu\n", -5 );}'
> 4294967291
you have a 32-bit perl (and OS).

I get the same result with
"This is perl, v5.8.4 built for i386-linux-thread-multi"

(knoppix 3.2.3 in qemu on Intel's amd64).

That means we are back at the same question I asked before about integer
overflow. In this case "(1<<32) = 1".
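
As a hedged illustration of the arithmetic under discussion (this is not
code from the kernel or from either poster): with a fixed-width unsigned
counter, plain subtraction already yields the elapsed tick count across a
single wraparound, because unsigned arithmetic is modulo 2^32. The helper
names ticks_elapsed and tick_after are made up for this sketch, and a
32-bit counter is assumed; tick_after imitates the signed-difference idea
behind the kernel's time_after() macro.

```c
#include <stdint.h>

/*
 * Elapsed ticks between two samples of a free-running 32-bit counter.
 * Unsigned subtraction wraps modulo 2^32, so the result is correct even
 * when the counter overflowed once between 'then' and 'now'.
 */
static uint32_t ticks_elapsed(uint32_t now, uint32_t then)
{
	return now - then;
}

/*
 * Nonzero if sample 'a' is later than sample 'b', assuming the two are
 * less than 2^31 ticks apart. The difference is reinterpreted as signed.
 */
static int tick_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}
```

With now = 5 and then = 4294967291 (five ticks before the wrap),
ticks_elapsed() gives 10. Feeding it now = 5 and then = 10, the example
from the quoted mail, gives 4294967291, the same number perl printed.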

>> I have included more variables to measure some issues of schedule in
>> the kernel (following schedstat idea) and I noticed that jiffies
>> wraparound has led to wrong values, since the user space tool when
>> collecting the values is producing negative values.
>> 
>
> hmm.. jiffies wrapped around in sched_info_depart()? I've never seen
> that happen. Could you post the additions and user space tool to look at?
> What additional features are you planning to measure in the scheduler?
>
>> Any comments?
>> 
>> Can I provide a patch for that?
>> 
>
> Please feel free to provide patches, this is open source!!
>
>> BR,
>> 
>> Mauricio Lin.
>
>
> -- 
>
>   Balbir Singh,
>   Linux Technology Center,
>   IBM Software Labs

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [ckrm-tech] [RFC] [PATCH 0/3] Add group fairness to CFS

2007-05-29 Thread Peter Williams


William Lee Irwin III wrote:
> On Wed, May 30, 2007 at 10:09:28AM +1000, Peter Williams wrote:
>> So what you're saying is that you think dynamic priority (or its
>> equivalent) should be used for load balancing instead of static priority?
>
> It doesn't do much in other schemes, but when fairness is directly
> measured by the dynamic priority, it is a priori more meaningful.
> This is not to say the net effect of using it is so different.


I suspect that while it's probably theoretically better it wouldn't make 
much difference on a real system (probably not enough to justify any 
extra complexity if there were any).  The exception might be on systems 
where there were lots of CPU intensive tasks that used relatively large 
chunks of CPU each time they were runnable which would give the load 
balancer a more stable load to try and balance.  It might be worth the 
extra effort to get it exactly right on those systems.  On most normal 
systems this isn't the case and the load balancer is always playing 
catch up to a constantly changing scenario.


Peter
--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

Re: Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards.

2007-05-29 Thread Robert Hancock


Justin Piszcz wrote:

Short Description of Problem:
Linux 2.6.21.3 does not run properly with 8GB of ram on the Intel 965WH 
motherboard.


Long Description of Problem:
When I use 8GB of memory on my x86_64 system, CPU-bound processes are VERY
slow, up to 36x slower than usual.  My temporary fix is to force Linux to
use only 4GB of memory; I am currently using mem=4096M.  I ran memtest86 and
the memory is fine, not a single error.  I tried mem= values of 1024M, 2048M
and 4096M, and leaving it blank ("") to let the kernel use all 8GB of memory.
What is wrong with the kernel, and why can it not use 8GB of memory without
slowing down all CPU-bound processes to a snail-like pace?  There is
something horribly wrong here.

Specifications:
Intel Motherboard: 965WH
Linux Kernel: 2.6.21.3
Distribution: Debian Testing x86_64
GCC: gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
Target: x86_64-linux-gnu

Tests:

1. append line = 1024M
top - 18:28:26 up 1 min,  4 users,  load average: 0.42, 0.17, 0.06
Tasks: 157 total,   1 running, 156 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1027016k total,   964288k used,    62728k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,   105168k cached
---> STATUS: No problems, box is fine, no lag, etc..

2. append line = 2048M
top - 18:34:23 up 2 min,  2 users,  load average: 0.14, 0.14, 0.05
Tasks: 147 total,   1 running, 146 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.7%us,  1.2%sy,  0.4%ni, 95.2%id,  1.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2059696k total,   956324k used,  1103372k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,   102924k cached
---> STATUS: No problems, box is fine, no lag, etc..

3. append line = 4096M
top - 18:37:55 up 1 min,  1 user,  load average: 0.52, 0.19, 0.07
Tasks: 143 total,   1 running, 142 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.9%us,  2.2%sy,  0.7%ni, 91.6%id,  2.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3339536k total,   949792k used,  2389744k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,    99920k cached

$ time ssh p34 uptime
 19:00:16 up 1 min,  1 user,  load average: 0.67, 0.18, 0.06
real    0m0.159s
user    0m0.013s
sys     0m0.003s
---> STATUS: No problems, box is fine, no lag, etc..

4. append line = "" (use all 8GB)

top - 18:52:50 up 9 min,  1 user,  load average: 2.88, 2.43, 1.41
Tasks: 149 total,   3 running, 146 sleeping,   0 stopped,   0 zombie
Cpu(s): 36.3%us,  2.2%sy, 10.3%ni, 50.8%id,  0.4%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8104460k total,  1064416k used,  7040044k free,     3296k buffers
Swap: 16787768k total,        0k used, 16787768k free,   201852k cached

$ ssh p34
ssh: connect to host p34 port 22: Connection refused

The machine takes 5-10 minutes to boot and acts like a 286 computer. About 8
minutes later:


$ time ssh p34 uptime  # 5 SECONDS!! 36x slower when using 8GB of RAM
 18:51:39 up 8 min,  1 user,  load average: 2.74, 2.31, 1.30

real    0m5.757s
user    0m0.015s
sys     0m0.004s

The machine is VERY slow, and this is on a gigabit network. I/O does not
seem to be affected, only CPU-bound processes.


  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2483 root      25   0 25324 5292 1072 R   96  0.1   4:37.12 mailgraph
 3604 logcheck  30  10  3408 1120  544 R   91  0.0   0:03.55 grep

These normally take seconds but when I use all 8GB of memory, it runs
for a very long time.

Conclusion: For now, I will be using mem=4096M until someone can help me 
understand what is happening here.  Can anyone offer any insight?


I found it interesting in make menuconfig on x86_64 there is no 4GB/64GB
options in the kernel that I remember seeing in 32bit.


That's because that option is not applicable in 64-bit mode.

Can you send your full dmesg output from the 8GB bootup?

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


[PATCH -mm] 2/2: PCI: disable decode of IO/memory during BAR sizing

2007-05-29 Thread Robert Hancock


Change PCI BAR sizing to disable the decode of memory or IO, as appropriate,
while we are writing the all-ones value to the BAR to determine the size.
If this is not done, the device may spuriously decode accesses to memory
areas it should not. On some Intel PCI Express chipsets, this breaks
MMCONFIG configuration space access, since the memory the graphics card ends up
decoding during this period overlaps the MMCONFIG area, and thus it steals
the accesses to the area to do any other configuration space access, including
changing the BAR back to its previous value.

However, don't do this disabling on host bridge devices, as it is reported that
some of them do silly things like disable CPU to RAM access if this is done.

Based on an original patch by Jesse Barnes.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

--- linux-2.6.22-rc2-mm1/drivers/pci/probe.c	2007-05-23 21:21:05.0 -0600
+++ linux-2.6.22-rc2-mm1edit/drivers/pci/probe.c	2007-05-29 21:31:47.0 -0600
@@ -180,6 +180,58 @@ static inline int is_64bit_memory(u32 ma
return 0;
}

+#define BAR_IS_MEMORY(bar) (((bar) & PCI_BASE_ADDRESS_SPACE) ==\
+   PCI_BASE_ADDRESS_SPACE_MEMORY)
+
+/**
+ * pci_bar_size - get raw PCI BAR size
+ * @dev: PCI device
+ * @reg: BAR to probe
+ *
+ * Use basic PCI probing:
+ *   - save original BAR value
+ *   - disable MEM or IO decode in PCI_COMMAND reg if appropriate
+ *   - write all 1s to the BAR
+ *   - read back value
+ *   - re-enable MEM or IO decode as necessary
+ *   - write original value back
+ *
+ * Returns raw BAR size to caller.
+ */
+static u32 pci_bar_size(struct pci_dev *dev, unsigned int reg)
+{
+	u32 orig_reg, sz;
+	u16 orig_cmd;
+
+	pci_read_config_dword(dev, reg, &orig_reg);
+	pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
+
+	/*
+	 * Disable memory or IO decode on the device while writing the test
+	 * value to the BAR. This prevents possible spurious decoding
+	 * of random addresses by the device. Don't do this for host bridges,
+	 * however, since some of them do silly things like disable CPU to RAM
+	 * access if this is done.
+	 */
+	if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST) {
+		if (BAR_IS_MEMORY(orig_reg))
+			pci_write_config_word(dev, PCI_COMMAND,
+					      orig_cmd & ~PCI_COMMAND_MEMORY);
+		else
+			pci_write_config_word(dev, PCI_COMMAND,
+					      orig_cmd & ~PCI_COMMAND_IO);
+	}
+
+	pci_write_config_dword(dev, reg, 0xffffffff);
+	pci_read_config_dword(dev, reg, &sz);
+	pci_write_config_dword(dev, reg, orig_reg);
+
+	if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST)
+		pci_write_config_word(dev, PCI_COMMAND, orig_cmd);
+
+	return sz;
+}
+
static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
{
unsigned int pos, reg, next;
@@ -196,16 +248,13 @@ static void pci_read_bases(struct pci_de
res->name = pci_name(dev);
reg = PCI_BASE_ADDRESS_0 + (pos << 2);
pci_read_config_dword(dev, reg, &l);
-   pci_write_config_dword(dev, reg, ~0);
-   pci_read_config_dword(dev, reg, &sz);
-   pci_write_config_dword(dev, reg, l);
+   sz = pci_bar_size(dev, reg);
if (!sz || sz == 0xffffffff)
continue;
if (l == 0xffffffff)
l = 0;
raw_sz = sz;
-   if ((l & PCI_BASE_ADDRESS_SPACE) ==
-   PCI_BASE_ADDRESS_SPACE_MEMORY) {
+   if (BAR_IS_MEMORY(l)) {
sz = pci_size(l, sz, (u32)PCI_BASE_ADDRESS_MEM_MASK);
/*
 * For 64bit prefetchable memory sz could be 0, if the
@@ -229,9 +278,7 @@ static void pci_read_bases(struct pci_de
u32 szhi, lhi;

pci_read_config_dword(dev, reg+4, &lhi);
-   pci_write_config_dword(dev, reg+4, ~0);
-   pci_read_config_dword(dev, reg+4, &szhi);
-   pci_write_config_dword(dev, reg+4, lhi);
+   szhi = pci_bar_size(dev, reg+4);
sz64 = ((u64)szhi << 32) | raw_sz;
l64 = ((u64)lhi << 32) | l;
sz64 = pci_size64(l64, sz64, PCI_BASE_ADDRESS_MEM_MASK);
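
For readers unfamiliar with BAR sizing, here is a minimal sketch of how a
size can be decoded from the value read back after writing all ones. It is
an illustration under assumptions, not the kernel's pci_size()
implementation: the function name bar_size_from_readback and the SKETCH_*
macros are hypothetical, and only 32-bit BARs are handled. The flag layout
(bit 0 selects I/O space; the low 4 bits of a memory BAR and the low 2 bits
of an I/O BAR are not address bits) follows the PCI specification.

```c
#include <stdint.h>

#define SKETCH_BAR_SPACE_IO  0x01u     /* bit 0 set: I/O space BAR        */
#define SKETCH_BAR_MEM_MASK  (~0x0fu)  /* address bits of a memory BAR    */
#define SKETCH_BAR_IO_MASK   (~0x03u)  /* address bits of an I/O BAR      */

/*
 * Decode the region size from the value read back after writing all ones
 * to a BAR. The device hardwires the low address bits to zero, so the
 * lowest address bit that reads back as one equals the region size.
 * Returns 0 for an unimplemented BAR (no writable address bits).
 */
static uint32_t bar_size_from_readback(uint32_t readback)
{
	uint32_t mask = (readback & SKETCH_BAR_SPACE_IO) ?
			SKETCH_BAR_IO_MASK : SKETCH_BAR_MEM_MASK;
	uint32_t addr_bits = readback & mask;

	if (addr_bits == 0)
		return 0;

	/* Isolate the lowest set bit: x & -x in unsigned arithmetic. */
	return addr_bits & (0u - addr_bits);
}
```

Taking the lowest set bit rather than inverting the whole value keeps the
result sensible for I/O BARs whose upper 16 bits always read as zero.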


[PATCH -mm] 0/2: PCI MMCONFIG-related updates

2007-05-29 Thread Robert Hancock


These two patches implement some changes in behavior related to PCI
MMCONFIG configuration space access. One changes the way in which we
validate the MCFG table provided by the BIOS by checking it against
ACPI motherboard resources instead of the E820 table. The BIOS is not
required to reserve this area in the E820 table, so checking that
results in MMCONFIG being unnecessarily disabled on some machines.

Some Intel chipsets where MMCONFIG was being disabled previously
(but won't be with the first patch) had problems, not due to the
MCFG table being broken, but because the access was hosed by the way
in which we do PCI BAR sizing. The second patch fixes this problem.

This is requested for inclusion in the -mm tree for testing.


[PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-29 Thread Robert Hancock


This patch adds validation of the MMCONFIG table against the ACPI reserved
motherboard resources. If the MMCONFIG table is found to be reserved in
ACPI, we don't bother checking the E820 table. The PCI Express firmware spec
apparently tells BIOS developers that reservation in ACPI is required and
E820 reservation is optional, so checking against ACPI first makes sense.
Many BIOSes don't reserve the MMCONFIG region in E820 even though it is
perfectly functional; the existing check needlessly disables MMCONFIG in
these cases.

In order to do this, MMCONFIG setup has been split into two phases. If PCI
configuration type 1 is not available then MMCONFIG is enabled early as before.
Otherwise, it is enabled later after the ACPI interpreter is enabled, since we
need to be able to execute control methods in order to check the ACPI reserved
resources. Presently this is just triggered off the end of ACPI interpreter
initialization.

There are a few other behavioral changes here:

-Validate all MMCONFIG configurations provided, not just the first one.

-Validate that the entire required length of each configuration, according to
the provided ending bus number, is reserved, not just the minimum required
allocation.

-Validate that the area is reserved even if we read it from the chipset directly
and not from the MCFG table. This catches the case where the BIOS didn't set the
location properly in the chipset and has mapped it over other things it
shouldn't have.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

diff -up linux-2.6.22-rc2-mm1/arch/i386/pci/init.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/init.c   2007-05-23 21:20:43.0 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c   2007-05-23 21:31:50.0 -0600
@@ -12,7 +12,7 @@ static __init int pci_access_init(void)
type = pci_direct_probe();
#endif
#ifdef CONFIG_PCI_MMCONFIG
-   pci_mmcfg_init(type);
+   pci_mmcfg_early_init(type);
#endif
if (raw_pci_ops)
return 0;
diff -up linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c   2007-05-23 21:21:04.0 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c   2007-05-23 21:38:19.0 -0600
@@ -206,9 +206,77 @@ static void __init pci_mmcfg_insert_reso
pci_mmcfg_resources_inserted = 1;
}

-static void __init pci_mmcfg_reject_broken(int type)
+static acpi_status __init check_mcfg_resource(struct acpi_resource *res,
+   void *data)
+{
+   struct resource *mcfg_res = data;
+   struct acpi_resource_address64 address;
+   acpi_status status;
+
+   if (res->type == ACPI_RESOURCE_TYPE_FIXED_MEMORY32) {
+   struct acpi_resource_fixed_memory32 *fixmem32 =
+   &res->data.fixed_memory32;
+   if (!fixmem32)
+   return AE_OK;
+   if ((mcfg_res->start >= fixmem32->address) &&
+   (mcfg_res->end <= (fixmem32->address +
+  fixmem32->address_length))) {
+   mcfg_res->flags = 1;
+   return AE_CTRL_TERMINATE;
+   }
+   }
+   if ((res->type != ACPI_RESOURCE_TYPE_ADDRESS32) &&
+   (res->type != ACPI_RESOURCE_TYPE_ADDRESS64))
+   return AE_OK;
+
+   status = acpi_resource_to_address64(res, &address);
+   if (ACPI_FAILURE(status) || (address.address_length <= 0) ||
+   (address.resource_type != ACPI_MEMORY_RANGE))
+   return AE_OK;
+
+   if ((mcfg_res->start >= address.minimum) &&
+   (mcfg_res->end <=
+(address.minimum +address.address_length))) {
+   mcfg_res->flags = 1;
+   return AE_CTRL_TERMINATE;
+   }
+   return AE_OK;
+}
+
+static acpi_status __init find_mboard_resource(acpi_handle handle, u32 lvl,
+   void *context, void **rv)
+{
+   struct resource *mcfg_res = context;
+
+   acpi_walk_resources(handle, METHOD_NAME__CRS,
+   check_mcfg_resource, context);
+
+   if (mcfg_res->flags)
+   return AE_CTRL_TERMINATE;
+
+   return AE_OK;
+}
+
+static int __init is_acpi_reserved(unsigned long start, unsigned long end)
+{
+   struct resource mcfg_res;
+
+   mcfg_res.start = start;
+   mcfg_res.end = end;
+   mcfg_res.flags = 0;
+
+   acpi_get_devices("PNP0C01", find_mboard_resource, &mcfg_res, NULL);
+
+   if( !mcfg_res.flags )
+   acpi_get_devices("PNP0C02", find_mboard_resource, &mcfg_res, NULL);
+
+   return mcfg_res.flags;
+}
+
+static void __init pci_mmcfg_reject_broken(void)
{
typeof(pci_mmcfg_config[0]) *cfg;
+   int i;

if ((pci_mmcfg_config_num == 0) ||
(pci_mmcfg_config == NULL) ||
@@
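
The reservation check described above boils down to asking whether the
whole MMCONFIG window lies inside one reserved range. The following is a
rough standalone sketch of that idea, not the patch's code: struct
mem_range, ecam_length and region_is_reserved are hypothetical names, the
reserved ranges are passed in as a plain array instead of being walked via
ACPI, and overflow of base + length is ignored. It assumes the usual
MMCONFIG layout in which each bus number covers 1 MB of configuration
space (32 devices x 8 functions x 4 KB).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One reserved address range, as a resource walk might report it. */
struct mem_range {
	uint64_t base;
	uint64_t length;
};

/* Bytes of MMCONFIG space needed for buses start_bus..end_bus inclusive. */
static uint64_t ecam_length(unsigned int start_bus, unsigned int end_bus)
{
	return ((uint64_t)(end_bus - start_bus) + 1) << 20;
}

/*
 * True if [start, end] lies entirely within a single reserved range.
 * The comparison mirrors the one in the patch: start must not precede
 * the range and end must not pass base + length.
 */
static bool region_is_reserved(const struct mem_range *ranges, size_t n,
			       uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < n; i++) {
		if (start >= ranges[i].base &&
		    end <= ranges[i].base + ranges[i].length)
			return true;
	}
	return false;
}
```

A BIOS that reserves only 64 MB at the MMCONFIG base would pass a check
sized for buses 0-63 but fail one sized for buses 0-255, which is the
situation the "entire required length" item is meant to catch.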

Re: tty-related oops in latest kernel(s)?

2007-05-29 Thread Tero Roponen


[resend, mailer didn't like unzipped applications]

On Tue, 29 May 2007, Pekka Enberg wrote:

> Hi Tero,
> 
> On 5/29/07, Tero Roponen <[EMAIL PROTECTED]> wrote:
> > FYI, I just tested 2.6.21.3. I couldn't reproduce the problem with
> > that kernel.
> 

[snip] 

> > Warning: dev (tty4) tty->count(3) != #fd's(2) in release_dev
> > release_dev: driver.table[3] not tty for (tty4)
> 
> Presumably someone tries to close the file again which is why we get a
> new complaint that reference counting has gone bad.
> 
> Unfortunately, I have no idea why drivers->tty does not match. It
> could be a race with release_tty() or real use-after-free but I am
> unable to find anything obvious in 2.6.21 -> 2.6.22-rc3 that would
> break it. Doing the git bisect dance here would really help...

Hmmm, I just found something interesting. In 2.6.21.3 the /sbin/init
gets corrupted when I watch the video!

$ cp /sbin/init init.before
$ mplayer kiwi.flv
$ cp /sbin/init init.after

The sha1sums are here:

52c8d643057619cbe137b8e69d4709ce3bdd832d  init.after
8efc7864a5b535a9e336fa82e9d7f112f3d956c1  init.before

It seems that something corrupts memory somewhere...

I attached those files in case someone can figure out
what is happening.

-- 
Tero Roponen

init.before.gz
Description: GNU Zip compressed data


init.after.gz
Description: GNU Zip compressed data

Re: [PATCH] Support for controlling leds on xbox 360 pad.

2007-05-29 Thread Dmitry Torokhov

On Tuesday 29 May 2007 17:41, Richard Purdie wrote:
> On Tue, 2007-05-29 at 23:29 +0200, Jan Kratochvil wrote:
> > I have a question, probably for Richard. Why is the
> > /sys/class/leds/whatsoever/brightness mode set to 0644? Is it really
> > necessary?
> > I feel like I'll be happy to allow anybody to change the state of this led.
> > (Ok, this maybe doesn't apply to other leds)
> 
> Permissions management of the LEDs is outside the scope of kernel. If
> you need users to have access to them, change the permissions in
> userspace to grant access.
> 

I will also venture to say that you only want to grant access to
the user currently logged on at the console, not any random user logged
in over the network, so you really want to dynamically manage
permissions. The kernel just provides sensible defaults.

-- 
Dmitry

Re: [PATCH 3/5] lockstat: core infrastructure

2007-05-29 Thread Andrew Morton

On Tue, 29 May 2007 14:52:51 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:

> Introduce the core lock statistics code.
> 

I must say that an aggregate addition of 27 ifdefs is a bit sad.  And there
is some easy stuff here.

> +#ifdef CONFIG_PROVE_LOCKING
> +int prove_locking = 1;
> +module_param(prove_locking, int, 0644);
> +#endif

#else
#define prove_locking 0
#endif

> +
> +#ifdef CONFIG_LOCK_STAT
> +int lock_stat = 1;
> +module_param(lock_stat, int, 0644);
> +#endif

ditto.
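
A small standalone sketch of the pattern being suggested, under the
assumption that the point is to let call sites test the flag without their
own #ifdef. SKETCH_LOCK_STAT, sketch_lock_stat and account_hold are
hypothetical names standing in for the config symbol, the runtime flag and
a caller; none of them come from the patch.

```c
/*
 * When the feature is configured in, the flag is a real variable that
 * can be flipped at run time. When it is configured out, the name
 * becomes the constant 0, so "if (!flag)" tests fold away at compile
 * time and the callers need no #ifdef of their own.
 */
#ifdef SKETCH_LOCK_STAT
int sketch_lock_stat = 1;
#else
#define sketch_lock_stat 0
#endif

/* Caller written once, with no conditional compilation. */
static void account_hold(unsigned long long *total, unsigned long long delta)
{
	if (!sketch_lock_stat)
		return;		/* always taken when configured out */

	*total += delta;
}
```

Built without SKETCH_LOCK_STAT, account_hold() leaves the total untouched
and the compiler can discard the accounting entirely.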

>
> ...
>
> +struct lock_class_stats lock_stats(struct lock_class *class)
> +{
> + struct lock_class_stats stats;
> + int cpu, i;
> +
> + memset(&stats, 0, sizeof(struct lock_class_stats));
> + for_each_possible_cpu(cpu) {
> + struct lock_class_stats *pcs =
> + &per_cpu(lock_stats, cpu)[class - lock_classes];
> +
> + for (i = 0; i < ARRAY_SIZE(stats.contention_point); i++)
> + stats.contention_point[i] += pcs->contention_point[i];
> +
> + lock_time_add(&pcs->read_waittime, &stats.read_waittime);
> + lock_time_add(&pcs->write_waittime, &stats.write_waittime);
> +
> + lock_time_add(&pcs->read_holdtime, &stats.read_holdtime);
> + lock_time_add(&pcs->write_holdtime, &stats.write_holdtime);
> + }
> +
> + return stats;
> +}

hm, that isn't trying to be very efficient.

> @@ -2035,6 +2131,11 @@ static int __lock_acquire(struct lockdep
>   int chain_head = 0;
>   u64 chain_key;
>  
> +#ifdef CONFIG_PROVE_LOCKING
> + if (!prove_locking)
> + check = 1;
> +#endif

Removable

> +#ifdef CONFIG_LOCK_STAT
> +static void lock_release_holdtime(struct held_lock *hlock)
> +{
> + struct lock_class_stats *stats;
> + unsigned long long holdtime;
> +
> + if (!lock_stat)
> + return;
> +
> + holdtime = sched_clock() - hlock->holdtime_stamp;
> +
> + stats = get_lock_stats(hlock->class);
> +
> + if (hlock->read)
> + lock_time_inc(&stats->read_holdtime, holdtime);
> + else
> + lock_time_inc(&stats->write_holdtime, holdtime);
> +
> + put_lock_stats(stats);
> +}
> +#else
> +static void lock_release_holdtime(struct held_lock *hlock)

inline

> +{
> +}
> +#endif
> +
> ...
>
> @@ -2456,6 +2712,14 @@ void lock_acquire(struct lockdep_map *lo
>  {
>   unsigned long flags;
>  
> +#ifdef CONFIG_LOCK_STAT
> + if (unlikely(!lock_stat))
> +#endif

removable

> +#ifdef CONFIG_PROVE_LOCKING
> + if (unlikely(!prove_locking))
> +#endif

removable

> @@ -2475,6 +2739,14 @@ void lock_release(struct lockdep_map *lo
>  {
>   unsigned long flags;
>  
> +#ifdef CONFIG_LOCK_STAT
> + if (unlikely(!lock_stat))
> +#endif

removable

> +#ifdef CONFIG_PROVE_LOCKING
> + if (unlikely(!prove_locking))
> +#endif
> + return;
> +
>   if (unlikely(current->lockdep_recursion))
>   return;
>  
>  
> ...
>
> +#ifdef CONFIG_LOCK_STAT
> +
> +extern void lock_contended(struct lockdep_map *lock, unsigned long ip);
> +extern void lock_acquired(struct lockdep_map *lock);
> +
> +#define LOCK_CONTENDED(_lock, try, lock) \
> +do { \
> + if (!try(_lock)) {  \
> + lock_contended(&(_lock)->dep_map, _RET_IP_);\
> + lock(_lock);\
> + lock_acquired(&(_lock)->dep_map);   \
> + }   \
> +} while (0)
> +
> +#else /* CONFIG_LOCK_STAT */
> +
> +#define lock_contended(l, i) do { } while (0)
> +#define lock_acquired(l) do { } while (0)

inlines are better.
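
To illustrate the remark, here is a sketch of the stub style being asked
for, with made-up names (struct sketch_map, sketch_lock_contended,
sketch_lock_acquired, sketch_take, SKETCH_LOCK_STAT) rather than the
patch's own. The assumed motivation is that an empty static inline still
type-checks its arguments when the feature is compiled out, whereas an
empty macro silently drops them.

```c
struct sketch_map {
	const char *name;
};

#ifdef SKETCH_LOCK_STAT
/* Real implementations would live elsewhere when the feature is on. */
void sketch_lock_contended(struct sketch_map *lock, unsigned long ip);
void sketch_lock_acquired(struct sketch_map *lock);
#else
/* Stubs: no code is generated, but argument types are still checked. */
static inline void sketch_lock_contended(struct sketch_map *lock,
					 unsigned long ip)
{
	(void)lock;
	(void)ip;
}

static inline void sketch_lock_acquired(struct sketch_map *lock)
{
	(void)lock;
}
#endif

/* Example caller: returns 1 if the slow (contended) path was taken. */
static int sketch_take(struct sketch_map *lock, int contended)
{
	int slow = 0;

	if (contended) {
		sketch_lock_contended(lock, 0UL);
		slow = 1;
	}
	sketch_lock_acquired(lock);
	return slow;
}
```

Passing the wrong pointer type to sketch_lock_contended() is a compile
error in both configurations, which an empty do { } while (0) macro would
not catch.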

> +#define LOCK_CONTENDED(_lock, try, lock) \
> + lock(_lock)
> +
> +#endif /* CONFIG_LOCK_STAT */
> +
>   },
>
> ...
>
> +#ifdef CONFIG_PROVE_LOCKING
> + {
> + .ctl_name   = KERN_PROVE_LOCKING,
> + .procname   = "prove_locking",
> + .data   = &prove_locking,
> + .maxlen = sizeof(int),
> + .mode   = 0644,
> + .proc_handler   = &proc_dointvec,
> + },
> +#endif
> +#ifdef CONFIG_LOCK_STAT
> + {
> + .ctl_name   = KERN_LOCK_STAT,
> + .procname   = "lock_stat",
> + .data   = &lock_stat,
> + .maxlen = sizeof(int),
> + .mode   = 0644,
> + .proc_handler   = &proc_dointvec,
> + },
> +#endif

Please use CTL_UNNUMBERED for new sysctls.

>   { .ctl_name = 0 }
>  };
> Index: linux-2.6-git/include/linux/sysctl.h
> ===
> --- linux-2.6-git.orig/include/linux/sysctl.h
> +++ linux-2.6-git/include/linux/sysctl.h
> @@ -166,6 +166,8 @@ enum
>   KERN_NMI_WATCHDOG=75, /* int: enable/disable nmi watchdog */
>   KERN_PANIC_ON_NMI=76, /* int: whether we will panic on an unrecovered */
>

Re: BUG: sleeping function called from invalid context at kernel/fork.c:385

2007-05-29 Thread David Chinner

On Wed, May 23, 2007 at 10:44:16AM -0700, Luck, Tony wrote:
> > > Saw this when running strace -f on a script on 2.6.21 on ia64:
> > > 
> > > BUG: sleeping function called from invalid context at kernel/fork.c:385
> > > in_atomic():1, irqs_disabled():0
> ... snip ...
> > > I could reproduce it via 'strace -f sleep 1'
> > > 
> >
> > I'd say this is specific to ia64.   Someone would have spotted it on
> > x86 by now.
> 
> I tried the "strace -f sleep 1" on 2.6.22-rc2, and I didn't see this "BUG"
> there.  Can you try your other test cases on the latest kernel?  If it has
> already been fixed we can see about identifying the patch for possible
> backport to 2.6.21.stable

Sorry for taking so long to get back to this - I still see this in
2.6.22-rc2.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Re: [PATCH 2/5] lockdep: sanitise CONFIG_PROVE_LOCKING

2007-05-29 Thread Andrew Morton

On Tue, 29 May 2007 16:16:17 +0200 Ingo Molnar <[EMAIL PROTECTED]> wrote:

> 
> * Christoph Hellwig <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, May 29, 2007 at 02:52:50PM +0200, Peter Zijlstra wrote:
> > > Ensure that all of the lock dependency tracking code is under
> > > CONFIG_PROVE_LOCKING. This allows us to use the held lock tracking code
> > > for other purposes.
> > 
> > > There's an awful lot of ifdefs introduced in this patch, I wonder
> > > whether it might be better to split up lockdep.c at those boundaries.
> 
> it adds 6 new #ifdefs. There's 35 #ifdefs in page_alloc.c, 44 in 
> sysctl.c and 64 in sched.c. I'd not call it 'an awful lot', although 
> certainly it could be reduced. Splitting lockdep.c up would uglify it 
> well beyond the impact of the 6 #ifdefs, given the amount of glue 
> needed.
> 

I'm not sure that we need to split lockdep.c, but it's a bit disappointing
that the patch didn't (couldn't?) move CONFIG_PROVE_LOCKING-only code and
data close together so that it can all fall within a single (or at least
fewer) ifdefs.

(Who came up with the (mis)name CONFIG_PROVE_LOCKING, btw?  Should have
been CONFIG_MIGHT_DISPROVE_LOCKING).


Re: [AppArmor 01/41] Pass struct vfsmount to the inode_create LSM hook

2007-05-29 Thread David Wagner

[EMAIL PROTECTED] wrote:
> no, this won't help you much against local users, [...]

Pavel Machek  wrote:
>Hmm, I guess I'd love "it is useless on multiuser boxes" to become
>standard part of AA advertising.

That's not quite what david@ said.  As I understand it, AppArmor is not
focused on preventing attacks by local users against other local users;
that's not the main problem it is trying to solve.  Rather, its primary
purpose is to deal with attacks by remote bad guys against your network
servers.  That is a laudable goal.  Anything that helps reduce the impact
of remote exploits is bound to be useful, even if it doesn't do a darn
thing to stop local users from attacking each other.

This means that AppArmor could still be useful on multiuser boxes,
even if that utility is limited to defending (some) network daemons
against remote attack (or, more precisely, reducing the damage done by
a successful remote attack against a network daemon).

Re: [PATCH 00/20] Blackfin update for 2.6.22-rc3

2007-05-29 Thread Bryan Wu

On Tue, 2007-05-29 at 19:42 -0700, Andrew Morton wrote:
> On Wed, 30 May 2007 10:31:49 +0800 Bryan Wu <[EMAIL PROTECTED]> wrote:
> 
> > So, can we setup a git-tree in kernel.org?
> 
> http://kernel.org/faq/#account
> 
> Tell them we sent you ;)

Thanks a lot. Sorry for missing the FAQ.
Best Regards,
-Bryan

Re: [ckrm-tech] [RFC] [PATCH 0/3] Add group fairness to CFS

2007-05-29 Thread William Lee Irwin III

On Mon, May 28, 2007 at 10:09:19PM +0530, Srivatsa Vaddagiri wrote:
> What do these task weights control? Timeslice primarily? If so, I am not
> sure how well it can co-exist with cfs then (unless you are planning to
> replace cfs with a equally good interactive/fair scheduler :)
> I would be very interested if this weight calculation can be used for
> smpnice based load balancing purposes too ..

Task weights represent shares of CPU bandwidth. If task i has weight w_i
then its share of CPU bandwidth is intended to be w_i/sum_j w_j.

"Load weight" seems to be used more in the scheduler source.
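
As a worked illustration of the share formula (a sketch only; cpu_share is
a hypothetical helper, not a scheduler function, and it uses floating
point, which kernel code would avoid):

```c
#include <stddef.h>

/*
 * Intended share of CPU bandwidth for task i: its weight divided by the
 * sum of all runnable tasks' weights. Returns 0.0 for an empty set or an
 * out-of-range index.
 */
static double cpu_share(const unsigned long *weights, size_t n, size_t i)
{
	unsigned long total = 0;

	for (size_t k = 0; k < n; k++)
		total += weights[k];

	if (total == 0 || i >= n)
		return 0.0;

	return (double)weights[i] / (double)total;
}
```

For three runnable tasks with weights 1024, 2048 and 1024, the middle task
is intended to get half of the CPU and the other two a quarter each.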


-- wli

Re: [PATCH 2/7] KAMEZAWA Hiroyuki - migration by kernel

2007-05-29 Thread Christoph Lameter

Looks good. I will ack it when I have a chance to test either your or 
Mel's patchset. Likely after the next iteration.



Re: [ckrm-tech] [RFC] [PATCH 0/3] Add group fairness to CFS

2007-05-29 Thread William Lee Irwin III

William Lee Irwin III wrote:
>> Lag is the deviation of a task's allocated CPU time from the CPU time
>> it would be granted by the ideal fair scheduling algorithm (generalized
>> processor sharing; take the limit of RR with per-task timeslices
>> proportional to load weight as the scale factor approaches zero).

On Wed, May 30, 2007 at 10:09:28AM +1000, Peter Williams wrote:
> Over what time period does this operate?

Over a period of time while the task is runnable.


William Lee Irwin III wrote:
>> Negative lag reflects receipt of excess CPU time. A close-to-canonical
>> "fairness metric" is the maximum of the absolute values of the lags of
>> all the tasks on the system. The "signed minimax pseudonorm" is the
>> largest lag without taking absolute values; it's a term I devised ad
>> hoc to describe the proposed algorithm.
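
The definitions quoted above can be made concrete with a small sketch. It
rests on assumptions: lag is taken as ideal time minus received time (so
that negative lag means excess CPU time, as stated), the task's share is
treated as constant over the interval it was runnable, and the names
task_lag, max_abs_lag and max_signed_lag are invented for illustration.

```c
#include <stddef.h>

/*
 * Lag of one task over an interval during which it was runnable:
 * the CPU time an ideal fair scheduler would have granted (share of the
 * interval) minus the CPU time actually received.
 */
static double task_lag(double share, double runnable_time, double received)
{
	return share * runnable_time - received;
}

/* Fairness metric: the largest absolute lag over all tasks. */
static double max_abs_lag(const double *lags, size_t n)
{
	double worst = 0.0;

	for (size_t i = 0; i < n; i++) {
		double a = lags[i] < 0.0 ? -lags[i] : lags[i];
		if (a > worst)
			worst = a;
	}
	return worst;
}

/* "Signed minimax pseudonorm": the largest lag, sign included. */
static double max_signed_lag(const double *lags, size_t n)
{
	double worst;

	if (n == 0)
		return 0.0;

	worst = lags[0];
	for (size_t i = 1; i < n; i++) {
		if (lags[i] > worst)
			worst = lags[i];
	}
	return worst;
}
```

A task entitled to half of a 10-unit interval that actually ran for 6
units has a lag of -1. For lags of -1, 0.25 and 0.75, the fairness metric
is 1 while the signed maximum is 0.75.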

On Wed, May 30, 2007 at 10:09:28AM +1000, Peter Williams wrote:
> So what you're saying is that you think dynamic priority (or its 
> equivalent) should be used for load balancing instead of static priority?

It doesn't do much in other schemes, but when fairness is directly
measured by the dynamic priority, it is a priori more meaningful.
This is not to say the net effect of using it is so different.
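
The definitions above can be sketched in userspace C, assuming the ideal (generalized processor sharing) allocation is already known for each task. All names below are hypothetical and the time unit is arbitrary; this only illustrates the two metrics, not the proposed balancing algorithm itself.

```c
#include <stddef.h>

/*
 * Lag: CPU time the ideal fair scheduler would have granted minus the
 * CPU time actually received. Negative lag means excess CPU time.
 */
double task_lag(double ideal, double received)
{
	return ideal - received;
}

/* Close-to-canonical fairness metric: maximum of |lag| over all tasks. */
double max_abs_lag(const double *lag, size_t n)
{
	double m = 0.0;
	size_t k;

	for (k = 0; k < n; k++) {
		double a = lag[k] < 0.0 ? -lag[k] : lag[k];

		if (a > m)
			m = a;
	}
	return m;
}

/*
 * "Signed minimax pseudonorm": the largest lag, taken without
 * absolute values. Returns 0.0 for an empty set of tasks.
 */
double signed_max_lag(const double *lag, size_t n)
{
	double m;
	size_t k;

	if (n == 0)
		return 0.0;
	m = lag[0];
	for (k = 1; k < n; k++)
		if (lag[k] > m)
			m = lag[k];
	return m;
}
```

For lags of -3, 1 and 2 the first metric is 3 (dominated by the task that got too much CPU), while the signed variant is 2 (the task that is furthest behind).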


William Lee Irwin III wrote:
>> I've presented a coherent
>> algorithm. It may be that there's no demonstrable problem to solve.
>> On the other hand, if there really is a question as to how to load
>> balance in the presence of tasks pinned to cpus, I just answered it.
> 
On Wed, May 30, 2007 at 10:09:28AM +1000, Peter Williams wrote:
> Unless I missed something there's nothing in your suggestion that does 
> anything more about handling pinned tasks than is already done by the 
> load balancer.

There are two things, both of which are relatively subtle and
coincidentally happen to some extent. The first is the unpinned lag,
which behaves much differently from unpinned load weight even if it's
not so different in concept, but apparently achieves a similar result.
The second is the relativistic point of view, which happens somewhat by
coincidence anyway, but isn't formalized anywhere at all as a basis for
handling tasks pinned to cpus. The first difference is minor and the
second formalizes something that mostly or completely already happens.


William Lee Irwin III wrote:
>> There was a second issue raised to which I responded. I didn't stray
>> per se. I addressed a second topic in the post.

On Wed, May 30, 2007 at 10:09:28AM +1000, Peter Williams wrote:
> OK.
> To reiterate, I don't think that my suggestion is really necessary.  I 
> think that the current load balancing (stand fast a small bug that's 
> being investigated) will come up with a good distribution of tasks to 
> CPUs within the constraints imposed by any CPU affinity settings.

It's sort of like performance. If the numbers are already good enough,
my algorithm on that front need not be bothered with either.


-- wli

Re: CPU hotplug: system hang on CPU hot remove during `pfmon --system-wide'

2007-05-29 Thread Rusty Russell

On Tue, 2007-05-29 at 13:56 -0700, Linus Torvalds wrote:
> 
> On Mon, 28 May 2007, Srivatsa Vaddagiri wrote:
> >
> > So is it settled now on what approach we are going to follow (freezer 
> > vs lock based) for cpu hotplug? I thought that Linus was not favouring 
> > freezer 
> > based approach sometime back ..
> 
> As far as I'm concerned, we should
>  - use "preempt_disable()" to protect against CPU's coming and going 
>  - use "stop_machine()" or similar that already honors preemption, and 
>which I trust a whole lot  more than freezer.
>  - .. especially since this is already how we are supposed to be protected 
>against CPU's going away, and we've already started doing that (for an 
>example of this, see things like e18f3ffb9c from Andrew)

Indeed, this is how it was supposed to work.

Note that it is possible to make stop_machine() an even larger hammer,
by scheduler mods to flush all the preempted tasks.  This would drop the
requirement for preempt_disable().

But cute as that would be, I've been waiting until someone demonstrates
an actual need...

Rusty.



Re: [PATCH 00/20] Blackfin update for 2.6.22-rc3

2007-05-29 Thread Andrew Morton

On Wed, 30 May 2007 10:31:49 +0800 Bryan Wu <[EMAIL PROTECTED]> wrote:

> So, can we setup a git-tree in kernel.org?

http://kernel.org/faq/#account

Tell them we sent you ;)

Re: [PATCH 2/7] KAMEZAWA Hiroyuki - migration by kernel

2007-05-29 Thread KAMEZAWA Hiroyuki

On Tue, 29 May 2007 18:36:50 +0100 (IST)
Mel Gorman <[EMAIL PROTECTED]> wrote:

> 
> This is a patch from KAMEZAWA Hiroyuki for using page migration on remote
> processes without races. This patch is still undergoing development and
> is expected to be a pre-requisite for both memory hot-remove and memory
> compaction.
> 
> Changelog from KAMEZAWA Hiroyuki version
> o Removed the MIGRATION_BY_KERNEL as a compile-time option
> 
This is my latest version.
(not tested because caller of this function is being rewritten now..)

I'll move this patch to the top of my series and prepare to post this patch as
a single patch.

==
page migration by kernel v2.

Changelog V1 -> V2
 *removed atomic ops.
 *removed the changes in anon_vma_free() and added a check before calling it.
 *reflected review feedback.
 *remove CONFIG_MIGRATION_BY_KERNEL

Usually, migrate_pages(page,,) is called from a system call with mm->sem held.
(mm here is the mm_struct which maps the migration target page.)
This semaphore helps to avoid some race conditions.

But if we want to migrate a page from kernel code without that semaphore,
we have to avoid those races ourselves. This patch adds checks for the
following race conditions.

1. A page which is not mapped can be a target of migration. So we have
   to check page_mapped() before calling try_to_unmap().

2. We can't trust page->mapping if page_mapcount() can go down to 0.
   But when we map newpage back to the original ptes, we have to access
   the anon_vma from a page whose page_mapcount() is 0.
   This patch adds a special refcnt to anon_vma, which is synced by
   anon_vma->lock and delays freeing of the anon_vma.

Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>


---
 include/linux/migrate.h |5 -
 include/linux/rmap.h|   11 +++
 mm/migrate.c|   35 +--
 mm/rmap.c   |   36 +++-
 4 files changed, 79 insertions(+), 8 deletions(-)

Index: linux-2.6.22-rc2-mm1/mm/migrate.c
===
--- linux-2.6.22-rc2-mm1.orig/mm/migrate.c
+++ linux-2.6.22-rc2-mm1/mm/migrate.c
@@ -607,11 +607,12 @@ static int move_to_new_page(struct page 
  * to the newly allocated page in newpage.
  */
 static int unmap_and_move(new_page_t get_new_page, unsigned long private,
-   struct page *page, int force)
+   struct page *page, int force, int nocontext)
 {
int rc = 0;
int *result = NULL;
struct page *newpage = get_new_page(page, private, &result);
+   struct anon_vma *anon_vma = NULL;
 
if (!newpage)
return -ENOMEM;
@@ -632,17 +633,23 @@ static int unmap_and_move(new_page_t get
goto unlock;
wait_on_page_writeback(page);
}
-
+   /* hold this anon_vma until page migration ends */
+   if (nocontext && PageAnon(page) && page_mapped(page))
+   anon_vma = anon_vma_hold(page);
/*
 * Establish migration ptes or remove ptes
 */
-   try_to_unmap(page, 1);
+   if (page_mapped(page))
+   try_to_unmap(page, 1);
+
if (!page_mapped(page))
rc = move_to_new_page(newpage, page);
 
if (rc)
remove_migration_ptes(page, page);
 
+   anon_vma_release(anon_vma);
+
 unlock:
unlock_page(page);
 
@@ -686,8 +693,8 @@ move_newpage:
  *
  * Return: Number of pages not migrated or error code.
  */
-int migrate_pages(struct list_head *from,
-   new_page_t get_new_page, unsigned long private)
+int __migrate_pages(struct list_head *from,
+   new_page_t get_new_page, unsigned long private, int nocontext)
 {
int retry = 1;
int nr_failed = 0;
@@ -707,7 +714,7 @@ int migrate_pages(struct list_head *from
cond_resched();
 
rc = unmap_and_move(get_new_page, private,
-   page, pass > 2);
+   page, pass > 2, nocontext);
 
switch(rc) {
case -ENOMEM:
@@ -737,6 +744,22 @@ out:
return nr_failed + retry;
 }
 
+int migrate_pages(struct list_head *from,
+   new_page_t get_new_page, unsigned long private)
+{
+   return __migrate_pages(from, get_new_page, private, 0);
+}
+
+/*
+ * When page migration is issued by the kernel itself without page mapper's
+ * mm->sem, we have to be more careful to do page migration.
+ */
+int migrate_pages_nocontext(struct list_head *from,
+   new_page_t get_new_page, unsigned long private)
+{
+   return __migrate_pages(from, get_new_page, private, 1);
+}
+
 #ifdef CONFIG_NUMA
 /*
  * Move a list of individual pages
Index: linux-2.6.22-rc2-mm1/include/linux/rmap.h
===
--- linux-2.6.22-rc2-mm1.orig/include/linux/rmap.h
+++

Re: 2.6.22-rc3: regression: make M=$PWD modules_install does nothing

2007-05-29 Thread Andrey Borzenkov

On Wednesday 30 May 2007, Sam Ravnborg wrote:
> Hi Andrey
>
> > This has been working up and including -rc2:
>
> I have tried to reproduce here - but it just works.
>
> I then took a second look at your excellent bug-report and noticed:
> > {pts/1}% sudo make -C ~/src/linux-git O=$HOME/build/linux-2.6.22 M=PWD
> > V=1
>
> that in this commandline you say M=PWD
>
> > modules_install
> > make: Entering directory `/home/bor/src/linux-git'
> > make -C /home/bor/build/linux-2.6.22 \
> > KBUILD_SRC=/home/bor/src/linux-git \
> > KBUILD_EXTMOD="PWD" -f /home/bor/src/linux-git/Makefile
>
> which tell kbuild that module is in a directory named 'PWD' as seen in the
> assignment to KBUILD_EXTMOD in the above line.
>
> Could it be that you by accident omitted a '$' in front of PWD?
> This would then tell kbuild that module are in the directory pointed to by
> $PWD.
>

You are right, I am sorry for the false alarm. One should not do these things
late at night.

ashamed ...

-andrey



Re: [AppArmor 01/41] Pass struct vfsmount to the inode_create LSM hook

2007-05-29 Thread Toshiharu Harada

2007/5/29, Kyle Moffett <[EMAIL PROTECTED]>:

>>> But writing policy with labels are somewhat indirect way (I mean,
>>> we need "ls -Z" or "ps -Z").  Indirect way can cause flaw so we
>>> need a lot of work that is what I wanted to tell.
>>
>> I don't really use "ls -Z" or "ps -Z" when writing SELinux policy; I
>> do that only when I actually think I mislabeled files.
>
> I believe what you wrote, but it may not be as easy for average
> Linux users.

As I said before, average Linux users should not be writing their own
security policy.  I have yet to meet an "average Linux user" who
could reliably quote for me what the file permissions on the "/tmp"
directory should be, or what the sticky bit was.  A small percentage
of average Linux system administrators don't get that right
consistently, and if you don't understand the sticky bit then you
should *definitely* not be controlling program permissions on a per-
syscall basis.

Thank you for your detailed and thoughtful explanation.
Things are much clearer to me now. Although your explanation was
quite persuasive, I still have some concerns.

Linux is now being used literally everywhere. As devices, technologies and
Linux itself are evolving so quickly, I'm afraid the way you showed is right
but can never meet every goal perfectly. So in some areas, including
embedded systems and special distros I guess, there must be solutions and
help for average-level administrators.

I think there are two ways to make secure systems.  One is what
you wrote, the "ask the professionals" way; the other is the "making
practices" way.
You might ask "how?" My answer to that question is pathname-based
systems such as AppArmor and TOMOYO Linux.
They can't be compared to SELinux, but they should be considered as
supplemental tools.  At least they are helpful for analyzing how Linux works.
Tweaking SELinux policy is not easy, but writing policies for
them is relatively easy (I'm not talking about security here).

Not everybody can be a professional administrator, but he/she can be a
professional administrator of his/her own system.  I believe there must be
solutions for non-professional administrators.  That's why we developed
TOMOYO Linux (http://tomoyo.sourceforge.jp/), and I guess the same goes
for AppArmor.  You might laugh, but we are doing this because we want to
contribute to Linux and its community. :)

Thanks,
Toshiharu Harada

Re: [PATCH 00/20] Blackfin update for 2.6.22-rc3

2007-05-29 Thread Bryan Wu

On Tue, 2007-05-29 at 13:59 -0700, Linus Torvalds wrote:
> 
> On Mon, 28 May 2007, Bryan Wu wrote:
> > 
> >  - Blackfin arch update including BF54x initial supporting
> >  - Blackfin driver update: serial/spi/rtc
> >  - Provide new Blackfin watchdog driver
> >  - binfmt_flat.c for Blackfin arch modification
> 
> I realize that this all just touches blackfin-specific stuff, but after 
> -rc3 I really prefer not to bother with these things..
> 

Yes, I see. It should be maintained by git-merge or git-pull easily.

> Also, for stuff that is really just an architecture that I can't even 
> test, and where there is a clear maintainership thing, I'd actually prefer 
> to just do a git merge, if possible. It's not like I will likely start 
> looking at some blackfin-specific patches. Judging from the diffs, you do 
> actually use git, do you have a place where you could export these kinds 
> of patch-series as a git tree instead?

We are waiting for your command to do this, -:)).
So, can we setup a git-tree in kernel.org?

Thanks a lot
Best Regards,
-Bryan Wu

Re: [patch] CFS scheduler, -v12

2007-05-29 Thread Peter Williams


Siddha, Suresh B wrote:

On Tue, May 29, 2007 at 04:54:29PM -0700, Peter Williams wrote:

I tried with various refresh rates of top too.. Do you see the issue
at runlevel 3 too?

I haven't tried that.

Do your spinners ever relinquish the CPU voluntarily?


Nope. Simple and plain while(1); 's

I can try 32-bit kernel to check.


Don't bother.  I just checked 2.6.22-rc3 and the problem is not present 
which means something between rc2 and rc3 has fixed the problem.  I hate 
it when problems (appear to) fix themselves as it usually means they're 
just hiding.


I didn't see any patches between rc2 and rc3 that were likely to have 
fixed this (but doesn't mean there wasn't one).  I'm wondering whether I 
should do a git bisect to see if I can find where it got fixed?


Could you see if you can reproduce it on 2.6.22-rc2?

Thanks
Peter
--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

Re: [2.6.22-rc2] BUG message during boot.

2007-05-29 Thread Christoph Lameter

On Thu, 24 May 2007, Tetsuo Handa wrote:

> It seems it is harmless because the system can continue running,
> but may be something bad?

Yes, it may be harmless... Something is asking the slab allocator for
an object of 0 bytes.

> Compat vDSO mapped to e000.
> BUG: at mm/slab.c:777 __find_general_cachep()
>  [] __kmalloc+0x4d/0xc8
>  [] init_table+0x19/0x4a
>  [] mtrr_bp_init+0x176/0x18e
>  [] check_bugs+0x5/0x3b
>  [] start_kernel+0x1e2/0x1eb
>  [] unknown_bootoption+0x0/0x181
>  ===
> Checking 'hlt' instruction... OK.

The fix is in linux-2.6.22-rc2-mm1


Re: kexec and aacraid broken

2007-05-29 Thread Andrew Morton

On Tue, 29 May 2007 18:59:32 -0700 "Yinghai Lu" <[EMAIL PROTECTED]> wrote:

> latest tree, can not use kexec to load 2.6.22-rc3 at least.
> 
> got:
> 
> AAC0: adapter kernel panic'd fffd
> AAC0: adapter kernel failed to start, init status=0

One of the two diffs below, I guess.  Please do a `patch -R -p1' of this
email and retest?

> 
> but can load 2.6.21.3
> 

Michal, can you please add this to the regression list?




commit 9e4d4a5d71d673901d9c1df5146ce545c2cc0cc0
Author: Salyzyn, Mark <[EMAIL PROTECTED]>
Date:   Tue May 1 11:43:06 2007 -0400

[SCSI] aacraid: superfluous adapter reset for IBM 8 series ServeRAID 
controllers

The kexec patch introduced a superfluous (and otherwise inert) reset of
some adapters. The register can have a hardware default value that has
zeros for the undefined interrupts. This patch refines the test of the
interrupt enable register to focus on only the interrupts that affect
the driver in order to detect if an incomplete shutdown of the Adapter
had occurred (kdump).

Signed-off-by: Mark Salyzyn <[EMAIL PROTECTED]>
Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
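
The effect of narrowing the mask can be sketched in userspace as follows. The masks 0xff and 0x0c are taken from the diff in this commit; the function name, the macro names and the sample register values are assumptions chosen for illustration, and nothing here says which interrupts the individual bits stand for.

```c
#include <stdbool.h>
#include <stdint.h>

#define OIMR_MASK_OLD 0xffu	/* mask tested before this commit */
#define OIMR_MASK_NEW 0x0cu	/* mask tested after this commit */

/*
 * True when a restart would be attempted: either not all bits selected
 * by mask are set in the value read from the register, or reset_devices
 * was given. restart_requested() is a hypothetical helper.
 */
bool restart_requested(uint8_t status, uint8_t mask, bool reset_devices)
{
	return ((status & mask) != mask) || reset_devices;
}
```

For a register value of 0x0f (undefined bits reading as zero), the old mask requests a restart while the new mask does not; a value of 0x03 requests a restart with either mask.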

diff --git a/drivers/scsi/aacraid/rx.c b/drivers/scsi/aacraid/rx.c
index b6ee3c0..291cd14 100644
--- a/drivers/scsi/aacraid/rx.c
+++ b/drivers/scsi/aacraid/rx.c
@@ -542,7 +542,7 @@ int _aac_rx_init(struct aac_dev *dev)
dev->a_ops.adapter_sync_cmd = rx_sync_cmd;
dev->a_ops.adapter_enable_int = aac_rx_disable_interrupt;
dev->OIMR = status = rx_readb (dev, MUnit.OIMR);
-   if ((((status & 0xff) != 0xff) || reset_devices) &&
+   if ((((status & 0x0c) != 0x0c) || reset_devices) &&
  !aac_rx_restart_adapter(dev, 0))
++restart;
/*
commit a5694ec545a880f9d23463fddc894f5096cc68fa
Author: Salyzyn, Mark <[EMAIL PROTECTED]>
Date:   Mon Apr 30 13:22:24 2007 -0400

[SCSI] aacraid: kexec fix (reset interrupt handler)

Another layer on this onion also discovered by Duane, the
interrupt enable handler also needed to be set ... The interrupt enable
was called from within the synchronous command handler.

Signed-off-by: Mark Salyzyn <[EMAIL PROTECTED]>
Signed-off-by: James Bottomley <[EMAIL PROTECTED]>

diff --git a/drivers/scsi/aacraid/rx.c b/drivers/scsi/aacraid/rx.c
index 0c71315..b6ee3c0 100644
--- a/drivers/scsi/aacraid/rx.c
+++ b/drivers/scsi/aacraid/rx.c
@@ -539,6 +539,8 @@ int _aac_rx_init(struct aac_dev *dev)
}
 
/* Failure to reset here is an option ... */
+   dev->a_ops.adapter_sync_cmd = rx_sync_cmd;
+   dev->a_ops.adapter_enable_int = aac_rx_disable_interrupt;
dev->OIMR = status = rx_readb (dev, MUnit.OIMR);
if ((((status & 0xff) != 0xff) || reset_devices) &&
  !aac_rx_restart_adapter(dev, 0))


kexec and aacraid broken

2007-05-29 Thread Yinghai Lu


latest tree, can not use kexec to load 2.6.22-rc3 at least.

got:

AAC0: adapter kernel panic'd fffd
AAC0: adapter kernel failed to start, init status=0


but can load 2.6.21.3


YH

[PATCH 4/5] serial: convert early_uart to earlycon for 8250

2007-05-29 Thread Yinghai Lu

[PATCH 4/5] serial: convert early_uart to earlycon for 8250

Because SERIAL_PORT_DFNS has been removed from include/asm-i386/serial.h and
include/asm-x86_64/serial.h, the serial8250_ports have to be probed late
in the serial initialization stage. The call chain
console_init => serial8250_console_init => register_console =>
serial8250_console_setup returns -ENODEV, so console ttyS0 cannot be enabled
at that time. We have to wait until uart_add_one_port in
drivers/serial/serial_core.c calls register_console to get console ttyS0,
which is too late.

Make early_uart use early_param, so the uart console can be used earlier.
Make it a boot console with the CON_BOOT flag, so it can use the console
handover feature and switch to the corresponding normal serial console
automatically.

new command line will be:
console=uart8250,io,0x3f8,9600n8
console=uart8250,mmio,0xff5e,115200n8
or
earlycon=uart8250,io,0x3f8,9600n8
earlycon=uart8250,mmio,0xff5e,115200n8

it will print in very early stage:
Early serial console at I/O port 0x3f8 (options '9600n8')
console [uart0] enabled
later for console it will print:
console handover: boot [uart0] -> real [ttyS0]
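
For illustration only, the option format shown above (uart8250, then io or mmio, then an address, then optional ttyS-style options) could be split up in userspace like this. The struct and function names are assumptions made for this sketch and are not the parser in drivers/serial/8250_early.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the decoded option string. */
struct early_uart_opts {
	int mmio;		/* 1 for "mmio", 0 for "io" */
	unsigned long addr;	/* I/O port or MMIO address */
	const char *options;	/* e.g. "9600n8", or NULL if absent */
};

/* Returns 0 on success, -1 if the string does not match the format. */
int parse_uart8250(const char *s, struct early_uart_opts *out)
{
	char *end;

	if (strncmp(s, "uart8250,", 9) != 0)
		return -1;
	s += 9;
	if (strncmp(s, "mmio,", 5) == 0) {
		out->mmio = 1;
		s += 5;
	} else if (strncmp(s, "io,", 3) == 0) {
		out->mmio = 0;
		s += 3;
	} else {
		return -1;
	}
	/* Base 0 lets strtoul accept the 0x prefix used on the command line. */
	out->addr = strtoul(s, &end, 0);
	if (end == s)
		return -1;
	if (*end == ',')
		out->options = end + 1;
	else if (*end == '\0')
		out->options = NULL;
	else
		return -1;
	return 0;
}
```

Feeding it "uart8250,io,0x3f8,9600n8" yields an I/O port of 0x3f8 with the options string "9600n8".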

Signed-off-by: <[EMAIL PROTECTED]>

 Documentation/kernel-parameters.txt |   11 +++
 arch/ia64/kernel/setup.c|4 -
 drivers/serial/8250.c   |   28 ++---
 drivers/serial/8250_early.c |  104 ++--
 drivers/serial/Kconfig  |   10 +++
 include/asm-i386/fixmap.h   |2 
 include/asm-i386/io.h   |   13 
 include/asm-ia64/io.h   |4 +
 include/asm-x86_64/fixmap.h |4 +
 include/asm-x86_64/io.h |   13 
 include/linux/console.h |2 
 include/linux/serial.h  |6 --
 include/linux/serial_8250.h |3 +
 init/main.c |5 +
 kernel/printk.c |   22 +++
 15 files changed, 133 insertions(+), 98 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index aae2282..b51fbac 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -464,13 +464,20 @@ and is between 256 and 4096 characters. It is defined in 
the file
Documentation/networking/netconsole.txt for an
alternative.
 
-   uart,io,<addr>[,options]
-   uart,mmio,<addr>[,options]
+   uart8250,io,<addr>[,options]
+   uart8250,mmio,<addr>[,options]
Start an early, polled-mode console on the 8250/16550
UART at the specified I/O port or MMIO address,
switching to the matching ttyS device later.  The
options are the same as for ttyS, above.
 
+   earlycon=   [KNL] Output early console device and options.
+   uart8250,io,<addr>[,options]
+   uart8250,mmio,<addr>[,options]
+   Start an early, polled-mode console on the 8250/16550
+   UART at the specified I/O port or MMIO address.
+   The options are the same as for ttyS, above.
+
cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver
Format:
,,,[,]
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index eaa6a24..dd7f95b 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -390,10 +390,6 @@ early_console_setup (char *cmdline)
if (!efi_setup_pcdp_console(cmdline))
earlycons++;
 #endif
-#ifdef CONFIG_SERIAL_8250_CONSOLE
-   if (!early_serial_console_init(cmdline))
-   earlycons++;
-#endif
 
return (earlycons) ? 0 : -1;
 }
diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index c84dab0..0b3ec38 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -2514,12 +2514,18 @@ static int __init serial8250_console_setup(struct 
console *co, char *options)
return uart_set_options(port, co, baud, parity, bits, flow);
 }
 
+static int __init serial8250_console_early_setup(void)
+{
+   return serial8250_find_port_for_earlycon();
+}
+
 static struct uart_driver serial8250_reg;
 static struct console serial8250_console = {
.name   = "ttyS",
.write  = serial8250_console_write,
.device = uart_console_device,
.setup  = serial8250_console_setup,
+   .early_setup= serial8250_console_early_setup,
.flags  = CON_PRINTBUFFER,
.index  = -1,
.data   = &serial8250_reg,
@@ -2533,7 +2539,7 @@ static int __init serial8250_console_init(void)
 }
 console_initcall(serial8250_console_init);
 
-static int __init find_port(struct uart_port *p)
+int serial8250_find_port(struct uart_port *p)
 {
int line;
struct uart_port *port;
@@ -2546,26

[PATCH 5/5] serial: set DTR in uart for kernel serial console

2007-05-29 Thread Yinghai Lu

[PATCH 5/5] serial: set DTR in uart for kernel serial console

Some UARTs on the other side need the host UART's DTR to be set; otherwise
they will not receive characters from the host the kernel is running on
during the kernel boot stage.

BTW:
earlyprintk and early_uart are hard-coded to set DTR/RTS.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>
Cc: Russell King <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Bjorn Helgaas <[EMAIL PROTECTED]>

diff --git a/drivers/serial/serial_core.c b/drivers/serial/serial_core.c
index 326020f..bec5eb5 100644
--- a/drivers/serial/serial_core.c
+++ b/drivers/serial/serial_core.c
@@ -2303,8 +2303,14 @@ int uart_add_one_port(struct uart_driver *drv, struct 
uart_port *port)
 * It may be that the port was not available.
 */
if (port->type != PORT_UNKNOWN &&
-   port->cons && !(port->cons->flags & CON_ENABLED))
+   port->cons && !(port->cons->flags & CON_ENABLED)) {
+   /*
+* some uarts on other side don't support no flow control.
+* So we set DTR in host uart to make them happy  --- YHLU
+*/
+   port->mctrl |= TIOCM_DTR;
register_console(port->cons);
+   }
 
/*
 * Ensure UPF_DEAD is not set.

[PATCH 2/5] console: console handover to preferred console

2007-05-29 Thread Yinghai Lu

[PATCH 2/5] console: console handover to preferred console

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/kernel/printk.c b/kernel/printk.c
index 0bbdeac..7b96cae 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -985,12 +1007,15 @@ void register_console(struct console *console)
if (!(console->flags & CON_ENABLED))
return;
 
-   if (bootconsole) {
+   if (bootconsole && (console->flags & CON_CONSDEV)) {
printk(KERN_INFO "console handover: boot [%s%d] -> real 
[%s%d]\n",
   bootconsole->name, bootconsole->index,
   console->name, console->index);
unregister_console(bootconsole);
console->flags &= ~CON_PRINTBUFFER;
+   } else {
+   printk(KERN_INFO "console [%s%d] enabled\n",
+  console->name, console->index);
}
 
/*

[PATCH 3/5] x86: initial fixmap support

2007-05-29 Thread Yinghai Lu

[PATCH 3/5] x86: initial fixmap support

From: "Eric W. Biederman" <[EMAIL PROTECTED]>

needed to get fixed virtual address for USB debug port and earlycon with mmio.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index f74dfc4..8271466 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -168,6 +168,12 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
 .section .init.text,"ax",@progbits
 #endif
 
+   /* Do an early initialization of the fixmap area */
+   movl $(swapper_pg_dir - __PAGE_OFFSET), %edx
+   movl $(swapper_pg_pmd - __PAGE_OFFSET), %eax
+   addl $0x007, %eax   /* 0x007 = PRESENT+RW+USER */
+   movl %eax, 4092(%edx)
+
 #ifdef CONFIG_SMP
 ENTRY(startup_32_smp)
cld
@@ -507,6 +513,8 @@ ENTRY(_stext)
 .section ".bss.page_aligned","w"
 ENTRY(swapper_pg_dir)
.fill 1024,4,0
+ENTRY(swapper_pg_pmd)
+   .fill 1024,4,0
 ENTRY(empty_zero_page)
.fill 4096,1,0
 
diff --git a/arch/x86_64/kernel/head.S b/arch/x86_64/kernel/head.S
index 1fab487..941c84b 100644
--- a/arch/x86_64/kernel/head.S
+++ b/arch/x86_64/kernel/head.S
@@ -73,7 +73,11 @@ startup_64:
addq%rbp, init_level4_pgt + (511*8)(%rip)
 
addq%rbp, level3_ident_pgt + 0(%rip)
+
addq%rbp, level3_kernel_pgt + (510*8)(%rip)
+   addq%rbp, level3_kernel_pgt + (511*8)(%rip)
+
+   addq%rbp, level2_fixmap_pgt + (506*8)(%rip)
 
/* Add an Identity mapping if I am above 1G */
leaq_text(%rip), %rdi
@@ -314,7 +318,16 @@ NEXT_PAGE(level3_kernel_pgt)
.fill   510,8,0
/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
.quad   level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
-   .fill   1,8,0
+   .quad   level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
+
+NEXT_PAGE(level2_fixmap_pgt)
+   .fill   506,8,0
+   .quad   level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
+   /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */
+   .fill   5,8,0
+
+NEXT_PAGE(level1_fixmap_pgt)
+   .fill   512,8,0
 
 NEXT_PAGE(level2_ident_pgt)
/* Since I easily can, map the first 1G.

[PATCH 1/5] console: more buf for index parsing

2007-05-29 Thread Yinghai Lu

[PATCH 1/5] console: more buf for index parsing

change name to buf according to the usage as name + index
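
A userspace sketch of the name + index decoding may help show why the buffer size matters. It follows the loop in console_setup() below, but NAME_LEN (standing in for sizeof(console_cmdline[0].name)) and the function name are assumptions made for this illustration.

```c
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 8	/* assumed stand-in for sizeof(console_cmdline[0].name) */

/*
 * Copy str into buf, cut it at the first digit or comma, and return the
 * number found there as the console index. On return buf holds only the
 * name part. Returns -1 if buf has no room at all.
 */
int split_console(const char *str, char *buf, size_t buflen)
{
	char *s;
	int idx;

	if (buflen == 0)
		return -1;
	strncpy(buf, str, buflen - 1);
	buf[buflen - 1] = '\0';
	for (s = buf; *s; s++)
		if ((*s >= '0' && *s <= '9') || *s == ',')
			break;
	idx = (int)strtoul(s, NULL, 10);
	*s = '\0';
	return idx;
}
```

In this sketch, "uart8250,io,0x3f8" decodes to name "uart" and index 8250 when the buffer has NAME_LEN + 4 bytes, whereas a buffer of only NAME_LEN bytes truncates the copy and gives index 825.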

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/kernel/printk.c b/kernel/printk.c
index 0bbdeac..4961410 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -654,7 +654,7 @@ static void call_console_drivers(unsigned long start, 
unsigned long end)
  */
 static int __init console_setup(char *str)
 {
-   char name[sizeof(console_cmdline[0].name)];
+   char buf[sizeof(console_cmdline[0].name) + 4]; /* 4 for index */
char *s, *options;
int idx;
 
@@ -662,27 +662,27 @@ static int __init console_setup(char *str)
 * Decode str into name, index, options.
 */
if (str[0] >= '0' && str[0] <= '9') {
-   strcpy(name, "ttyS");
-   strncpy(name + 4, str, sizeof(name) - 5);
+   strcpy(buf, "ttyS");
+   strncpy(buf + 4, str, sizeof(buf) - 5);
} else {
-   strncpy(name, str, sizeof(name) - 1);
+   strncpy(buf, str, sizeof(buf) - 1);
}
-   name[sizeof(name) - 1] = 0;
+   buf[sizeof(buf) - 1] = 0;
if ((options = strchr(str, ',')) != NULL)
*(options++) = 0;
 #ifdef __sparc__
if (!strcmp(str, "ttya"))
-   strcpy(name, "ttyS0");
+   strcpy(buf, "ttyS0");
if (!strcmp(str, "ttyb"))
-   strcpy(name, "ttyS1");
+   strcpy(buf, "ttyS1");
 #endif
-   for (s = name; *s; s++)
+   for (s = buf; *s; s++)
if ((*s >= '0' && *s <= '9') || *s == ',')
break;
idx = simple_strtoul(s, NULL, 10);
*s = 0;
 
-   add_preferred_console(name, idx, options);
+   add_preferred_console(buf, idx, options);
return 1;
 }
 __setup("console=", console_setup);
@@ -709,7 +709,7 @@ int __init add_preferred_console(char *name, int idx, char 
*options)
 *  See if this tty is not yet registered, and
 *  if we have a slot free.
 */
-   for(i = 0; i < MAX_CMDLINECONSOLES && console_cmdline[i].name[0]; i++)
+   for (i = 0; i < MAX_CMDLINECONSOLES && console_cmdline[i].name[0]; i++)
if (strcmp(console_cmdline[i].name, name) == 0 &&
  console_cmdline[i].index == idx) {
selected_console = i;

[resend] [AGPGART] intel_agp: use table for device probe

2007-05-29 Thread Wang Zhenyu


Fixed the issues noted by Christoph Hellwig, and changed the device table
scan a bit to allow for the case where several models of graphics chips
have the same host bridge type. Chips of this type will be added in the future.

This patch cleans up the device probe function. Eric Anholt was the original author.

Signed-off-by: Wang Zhenyu <[EMAIL PROTECTED]>
---
 drivers/char/agp/intel-agp.c |  300 ++
 1 files changed, 98 insertions(+), 202 deletions(-)

diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
index e12f579..dcb6d4f 100644
--- a/drivers/char/agp/intel-agp.c
+++ b/drivers/char/agp/intel-agp.c
@@ -1717,41 +1717,92 @@ static const struct agp_bridge_driver intel_7505_driver 
= {
.agp_type_to_mask_type  = agp_generic_type_to_mask_type,
 };
 
-static int find_i810(u16 device)
-{
-   struct pci_dev *i810_dev;
 
-   i810_dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, NULL);
-   if (!i810_dev)
-   return 0;
-   intel_private.pcidev = i810_dev;
-   return 1;
-}
-
-static int find_i830(u16 device)
+static int find_gmch(u16 device)
 {
-   struct pci_dev *i830_dev;
+   struct pci_dev *gmch_device;
 
-   i830_dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, NULL);
-   if (i830_dev && PCI_FUNC(i830_dev->devfn) != 0) {
-   i830_dev = pci_get_device(PCI_VENDOR_ID_INTEL,
-   device, i830_dev);
+   gmch_device = pci_get_device(PCI_VENDOR_ID_INTEL, device, NULL);
+   if (gmch_device && PCI_FUNC(gmch_device->devfn) != 0) {
+   gmch_device = pci_get_device(PCI_VENDOR_ID_INTEL,
+device, gmch_device);
}
 
-   if (!i830_dev)
+   if (!gmch_device)
return 0;
 
-   intel_private.pcidev = i830_dev;
+   intel_private.pcidev = gmch_device;
return 1;
 }
 
+/* Table to describe Intel GMCH and AGP/PCIE GART drivers.  At least one of
+ * driver and gmch_driver must be non-null, and find_gmch will determine
+ * which one should be used if a gmch_chip_id is present.
+ */
+static const struct intel_driver_description {
+   unsigned int chip_id;
+   unsigned int gmch_chip_id;
+   char *name;
+   const struct agp_bridge_driver *driver;
+   const struct agp_bridge_driver *gmch_driver;
+} intel_agp_chipsets[] = {
+   { PCI_DEVICE_ID_INTEL_82443LX_0, 0, "440LX", &intel_generic_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82443BX_0, 0, "440BX", &intel_generic_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82443GX_0, 0, "440GX", &intel_generic_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82810_MC1, PCI_DEVICE_ID_INTEL_82810_IG1, "i810",
+   NULL, &intel_810_driver },
+   { PCI_DEVICE_ID_INTEL_82810_MC3, PCI_DEVICE_ID_INTEL_82810_IG3, "i810",
+   NULL, &intel_810_driver },
+   { PCI_DEVICE_ID_INTEL_82810E_MC, PCI_DEVICE_ID_INTEL_82810E_IG, "i810",
+   NULL, &intel_810_driver },
+   { PCI_DEVICE_ID_INTEL_82815_MC, PCI_DEVICE_ID_INTEL_82815_CGC, "i815",
+   &intel_810_driver, &intel_815_driver },
+   { PCI_DEVICE_ID_INTEL_82820_HB, 0, "i820", &intel_820_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82820_UP_HB, 0, "i820", &intel_820_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82830_HB, PCI_DEVICE_ID_INTEL_82830_CGC, "830M",
+   &intel_830mp_driver, &intel_830_driver },
+   { PCI_DEVICE_ID_INTEL_82840_HB, 0, "i840", &intel_840_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82845_HB, 0, "845G", &intel_845_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82845G_HB, PCI_DEVICE_ID_INTEL_82845G_IG, "830M",
+   &intel_845_driver, &intel_830_driver },
+   { PCI_DEVICE_ID_INTEL_82850_HB, 0, "i850", &intel_850_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82855PM_HB, 0, "855PM", &intel_845_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82855GM_HB, PCI_DEVICE_ID_INTEL_82855GM_IG, "855GM",
+   &intel_845_driver, &intel_830_driver },
+   { PCI_DEVICE_ID_INTEL_82860_HB, 0, "i860", &intel_860_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82865_HB, PCI_DEVICE_ID_INTEL_82865_IG, "865",
+   &intel_845_driver, &intel_830_driver },
+   { PCI_DEVICE_ID_INTEL_82875_HB, 0, "i875", &intel_845_driver, NULL },
+   { PCI_DEVICE_ID_INTEL_82915G_HB, PCI_DEVICE_ID_INTEL_82915G_IG, "915G",
+   &intel_845_driver, &intel_915_driver },
+   { PCI_DEVICE_ID_INTEL_82915GM_HB, PCI_DEVICE_ID_INTEL_82915GM_IG, "915GM",
+   &intel_845_driver, &intel_915_driver },
+   { PCI_DEVICE_ID_INTEL_82945G_HB, PCI_DEVICE_ID_INTEL_82945G_IG, "945G",
+   &intel_845_driver, &intel_915_driver },
+   { PCI_DEVICE_ID_INTEL_82945GM_HB, PCI_DEVICE_ID_INTEL_82945GM_IG, "945GM",
+   &intel_845_driver, &intel_915_driver },
+   { PCI_DEVICE_ID_INTEL_82946GZ_HB, PCI_DEVICE_ID_INTEL_82946GZ_IG, "946GZ",
+   &intel_845_driver, &intel_i965_driver },
+   { PCI_DEVICE_ID_INTEL_82965G_1_HB, PCI_DEVICE_ID_INTEL_82965G_1_IG, "965G",
+   &intel_845_driver, &intel_i965_driver },
+   { PCI_DEVICE_ID_INTEL_82965Q_HB,

Re: [RFC, PATCH 1/3] introduce SYS_CLONE_MASK

2007-05-29 Thread Eric W. Biederman

"Albert Cahalan" <[EMAIL PROTECTED]> writes:

> On 5/29/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> "Albert Cahalan" <[EMAIL PROTECTED]> writes:
>
> That's not what I mean. (the "-e" causes that of course)
> I'm asking about the parent-child relationships shown.
> The "-H" option is a bit different from the "f" option.

Yes.  Sorry on the unmodified ps the parent-child relationship
seems to be displayed properly.  

>>> I'd be a lot happier about breaking compatibility in this area
>>> if I could get a functional adoption flag. That is, I really
>>> would like to show a process as child of init if it naturally
>>> was created as a child of init. It's less informative to have
>>> fake children showing up the same as real ones. The original
>>> parent PID would do. (BTW, the original parent name and/or
>>> grandparent PID would be great to have) As a bonus, the kernel
>>> could reap these processes more quickly than init can... and
>>> then maybe we can stop caring if init is alive.
>>
>> Having the kernel not reparent user processes to init is an interesting
>> idea, especially when those processes have not existed.  I'm not
>> certain that is POSIX compliant and otherwise backwards compatible.
>
> I'm not suggesting that this be visible via POSIX APIs.
>
> It's almost certainly a given that getppid() must return 1, and
> probably /proc needs to show this as well. Without question,
> any process created by init must be reaped by init.
>
> Processes NOT created by init could be silently reaped by
> the kernel. They need to see their own PPID as 1, but there
> need not be any parent-child relationship in the kernel data
> structures. The kernel can fake the whole thing, which is nice
> because then the kernel isn't depending on userspace to
> correctly perform the pointless action of playing with zombies.
> (might setting the death signal to 0 be useful here?)
>
> For "ps fax" and such, I'd like to distinguish between init's
> real and adopted children. Right now the adopted children
> look like they were created by init, which is not true. I only
> need a simple boolean flag, set upon reparenting, to tell me.
> Such a flag may also be useful for optimizing away the whole
> wait/waitpid/wait4/waitid/wait3 nonsense when an adopted
> child dies.

I will keep it in mind.  A simple "this process has been reparented"
flag probably won't be too bad.  As for the rest I'm not certain.

With pid namespaces there is a certain sense in doing something like
this, but I'm not certain /sbin/init and all of its replacements
don't care (although admittedly it would be a stretch to tell the
difference).

Eric


[resend] [AGPGART] intel_agp: cleanup intel private data

2007-05-29 Thread Wang Zhenyu


Remove the volatile type qualifier from the I/O memory variables.

Use a single private GART data structure for all drivers, which
makes the code cleaner. Eric Anholt wrote the original patch.

Signed-off-by: Wang Zhenyu <[EMAIL PROTECTED]>
---
 drivers/char/agp/intel-agp.c |  195 --
 1 files changed, 91 insertions(+), 104 deletions(-)

diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
index 9c69f2e..e12f579 100644
--- a/drivers/char/agp/intel-agp.c
+++ b/drivers/char/agp/intel-agp.c
@@ -86,11 +86,18 @@ static struct gatt_mask intel_i810_masks[] =
 .type = INTEL_AGP_CACHED_MEMORY}
 };
 
-static struct _intel_i810_private {
-   struct pci_dev *i810_dev;   /* device one */
-   volatile u8 __iomem *registers;
+static struct _intel_private {
+   struct pci_dev *pcidev; /* device one */
+   u8 __iomem *registers;
+   u32 __iomem *gtt;   /* I915G */
int num_dcache_entries;
-} intel_i810_private;
+   /* gtt_entries is the number of gtt entries that are already mapped
+* to stolen memory.  Stolen memory is larger than the memory mapped
+* through gtt_entries, as it includes some reserved space for the BIOS
+* popup and for the GTT.
+*/
+   int gtt_entries;/* i830+ */
+} intel_private;
 
 static int intel_i810_fetch_size(void)
 {
@@ -127,32 +134,32 @@ static int intel_i810_configure(void)
 
current_size = A_SIZE_FIX(agp_bridge->current_size);
 
-   if (!intel_i810_private.registers) {
-   pci_read_config_dword(intel_i810_private.i810_dev, I810_MMADDR, &temp);
+   if (!intel_private.registers) {
+   pci_read_config_dword(intel_private.pcidev, I810_MMADDR, &temp);
temp &= 0xfff80000;
 
-   intel_i810_private.registers = ioremap(temp, 128 * 4096);
-   if (!intel_i810_private.registers) {
+   intel_private.registers = ioremap(temp, 128 * 4096);
+   if (!intel_private.registers) {
printk(KERN_ERR PFX "Unable to remap memory.\n");
return -ENOMEM;
}
}
 
-   if ((readl(intel_i810_private.registers+I810_DRAM_CTL)
+   if ((readl(intel_private.registers+I810_DRAM_CTL)
& I810_DRAM_ROW_0) == I810_DRAM_ROW_0_SDRAM) {
/* This will need to be dynamically assigned */
printk(KERN_INFO PFX "detected 4MB dedicated video ram.\n");
-   intel_i810_private.num_dcache_entries = 1024;
+   intel_private.num_dcache_entries = 1024;
}
-   pci_read_config_dword(intel_i810_private.i810_dev, I810_GMADDR, &temp);
+   pci_read_config_dword(intel_private.pcidev, I810_GMADDR, &temp);
agp_bridge->gart_bus_addr = (temp & PCI_BASE_ADDRESS_MEM_MASK);
-   writel(agp_bridge->gatt_bus_addr | I810_PGETBL_ENABLED, intel_i810_private.registers+I810_PGETBL_CTL);
-   readl(intel_i810_private.registers+I810_PGETBL_CTL);/* PCI Posting. */
+   writel(agp_bridge->gatt_bus_addr | I810_PGETBL_ENABLED, intel_private.registers+I810_PGETBL_CTL);
+   readl(intel_private.registers+I810_PGETBL_CTL); /* PCI Posting. */
 
if (agp_bridge->driver->needs_scratch_page) {
for (i = 0; i < current_size->num_entries; i++) {
-   writel(agp_bridge->scratch_page, intel_i810_private.registers+I810_PTE_BASE+(i*4));
-   readl(intel_i810_private.registers+I810_PTE_BASE+(i*4));/* PCI posting. */
+   writel(agp_bridge->scratch_page, intel_private.registers+I810_PTE_BASE+(i*4));
+   readl(intel_private.registers+I810_PTE_BASE+(i*4)); /* PCI posting. */
}
}
global_cache_flush();
@@ -161,9 +168,9 @@ static int intel_i810_configure(void)
 
 static void intel_i810_cleanup(void)
 {
-   writel(0, intel_i810_private.registers+I810_PGETBL_CTL);
-   readl(intel_i810_private.registers);/* PCI Posting. */
-   iounmap(intel_i810_private.registers);
+   writel(0, intel_private.registers+I810_PGETBL_CTL);
+   readl(intel_private.registers); /* PCI Posting. */
+   iounmap(intel_private.registers);
 }
 
 static void intel_i810_tlbflush(struct agp_memory *mem)
@@ -261,9 +268,9 @@ static int intel_i810_insert_entries(struct agp_memory *mem, off_t pg_start,
global_cache_flush();
for (i = pg_start; i < (pg_start + mem->page_count); i++) {
writel((i*4096)|I810_PTE_LOCAL|I810_PTE_VALID,
-  intel_i810_private.registers+I810_PTE_BASE+(i*4));
+  intel_private.registers+I810_PTE_BASE+(i*4));
}
-   readl(intel_i810_private.registers+I810_PTE_BASE+((i-1)*4));
+   readl(intel_private.registers+I810_PTE_BASE+((i-1)*4));
break;
case

[git pull] drm fixes for 2.6.22-rc3

2007-05-29 Thread Dave Airlie



Hi Linus,

Please pull the 'drm-patches' branch from
ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-patches

It contains a fix for a kmalloc 0, along with new pci ids for the radeon rs480
and a spinlock initialiser.

I have stuff that actually fixes up the drm_drawable.c to use an idr but I'll 
hold until the next merge window.

Dave.

 drivers/char/drm/drm_drawable.c |   41 --
 drivers/char/drm/drm_pciids.h   |7 ++
 drivers/char/drm/i915_irq.c |2 +-
 3 files changed, 34 insertions(+), 16 deletions(-)

commit c4814f9001a8dd28e39311a919beac34f778f76d
Author: Michel Dänzer <[EMAIL PROTECTED]>
Date:   Sat May 26 04:37:08 2007 +1000

drm: make sure the drawable code doesn't call malloc(0).

Signed-off-by: Michel Dänzer <[EMAIL PROTECTED]>
Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

commit 777c7738a598c6e8d4b850181a509757fb79cf36
Author: Dave Airlie <[EMAIL PROTECTED]>
Date:   Sat May 26 04:19:03 2007 +1000

drm/radeon: add more IGP chipset pci ids

Add more IGP chipset PCI IDs

Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

commit a6399bdd492a3289d39e4b79cbe69ad44a054ee3
Author: Thomas Gleixner <[EMAIL PROTECTED]>
Date:   Sat May 26 05:56:14 2007 +1000

drm: Spinlock initializer cleanup

Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Dave Airlie <[EMAIL PROTECTED]>

Re: Syslets, Threadlets, generic AIO support, v6

2007-05-29 Thread Dave Jones

On Tue, May 29, 2007 at 04:20:04PM -0700, Ulrich Drepper wrote:
 > Zach Brown wrote:
 > > That todo item
 > > about producing documentation and distro kernels is specifically to bait
 > > Uli into trying to implement posix aio on top of syslets in glibc.
 > 
 > Get DaveJ to pick up the code for Fedora kernels and I'll get to it.

With F7 out the door, I'm looking at getting devel/ back in shape again,
so I can get something done there soon-ish.  With the usual caveat that if
this isn't upstream by the time we do a release, we'll have to drop it
due to the added syscall. (Maybe we can just get that reserved upstream now?)

Dave

-- 
http://www.codemonkey.org.uk

Re: reduce-size-of-task_struct-on-64-bit-machines.patch removed from -mm tree

2007-05-29 Thread Kyle McMartin

On Tue, May 29, 2007 at 05:55:01PM -0400, Kyle McMartin wrote:
> Other 64-bit arch maintainers should check that they're using a 32-bit
> load/store on these fields in their assembly now (we were using a
> #define to do word on 32bit and doubleword on 64bit, which broke, badly,
> when these changed.)
> 

Bummer, I just audited everyone else and we're the only people who use
these flags in assembly. Sigh.

Oh well.

Cheers,
Kyle
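
The hazard described above (a doubleword access to a field that has shrunk to 32 bits) can be sketched in portable C. This is a minimal illustration under stated assumptions: the struct layout and names (`task_like`, `load_flags_32`, `load_flags_64`) are hypothetical and do not correspond to the real task_struct or to any architecture's assembly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a field that used to be a machine word is now
 * 32 bits wide, so another field sits directly behind it. */
struct task_like {
	uint32_t flags;
	uint32_t neighbour;
};

/* Correct: load exactly the 32 bits the field occupies. */
static uint32_t load_flags_32(const struct task_like *t)
{
	uint32_t v;

	memcpy(&v, &t->flags, sizeof(v));
	return v;
}

/* Stale: a doubleword load at the same offset also picks up the
 * neighbouring field, on either byte order. */
static uint64_t load_flags_64(const struct task_like *t)
{
	uint64_t v;

	memcpy(&v, t, sizeof(v));
	return v;
}
```

A doubleword store has the mirror-image problem: it would overwrite the neighbouring field as well.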

Re:swap prefetch improvements

2007-05-29 Thread Con Kolivas

On Wednesday 30 May 2007 05:59, Antonino Ingargiola wrote:
> 2007/5/29, Antonino Ingargiola <[EMAIL PROTECTED]>:
> [cut]
>
> > Swap Prefetch OFF
> > # ./sp_tester
> > Ram 776388000  Swap 51404
> > Total ram to be malloced: 1033408000 bytes
> > Starting first malloc of 516704000 bytes
> > Starting 1st read of first malloc
> > Touching this much ram takes 1642 milliseconds
> > Starting second malloc of 516704000 bytes
> > Completed second malloc and free
> > Sleeping for 60 seconds
> > Important part - starting reread of first malloc
> > Completed read of first malloc
> > Timed portion 9089 milliseconds
> >
> >
> > Swap Prefetch OFF
> > # ./sp_tester
> > Ram 776388000  Swap 51404
> > Total ram to be malloced: 1033408000 bytes
> > Starting first malloc of 516704000 bytes
> > Starting 1st read of first malloc
> > Touching this much ram takes 1635 milliseconds
> > Starting second malloc of 516704000 bytes
> > Completed second malloc and free
> > Sleeping for 60 seconds
> > Important part - starting reread of first malloc
> > Completed read of first malloc
> > Timed portion 1783 milliseconds
>
> The second case is clearly with swap prefetch *ON*, sorry.

Thanks very much for testing! The patch has been taken up by Andrew for the 
next -mm.

Thanks
-- 
-ck

Re: [PATCH 00/20] Blackfin update for 2.6.22-rc3

2007-05-29 Thread Linus Torvalds

On Wed, 30 May 2007, Bernd Schmidt wrote:
> 
> The binfmt_flat patch also touches other nommu architectures.  Do you
> want these kinds of patches (which aren't just Blackfin-specific)
> separately as they come up?

Let's take it case by case. I really have no problem with architecture- 
specific git trees occasionally touching generic code, but I usually want 
an explanation for _why_ it ends up touching some file that somebody else 
might care about (not so much because I necessarily care deeply, but so 
that I can keep up with changes, so that if somebody then complains, I 
know what they are complaining about!)

For example, what's good to do is to just give me a diffstat of the whole 
thing, and then for generic code, perhaps show the whole diff along with 
the explanation.

And if the diff is big, yes, then we probably need to handle it 
separately, but usually the things tend to be a few lines of "add support 
for this architecture to this piece of code that hasn't been abstracted 
out enough yet".

To make a long story short: no real hard rules. It can be a good idea to 
start out extra careful, and once both sides get more used to each other, 
we can loosen the rules.

Linus

Re: [PATCH] RTC: Use fallback IRQ if PNP tables don't provide one

2007-05-29 Thread Matthew Garrett

On Tue, May 29, 2007 at 05:30:58PM -0700, Andrew Morton wrote:

> Matthew didn't reply to this, almost surely because you removed him
> (and David) from the cc.  Please don't ever do that.

Ah, I saw this on linux-acpi and replied to it there. There are no entries
in the options file, and the resources file lists no IRQ.

-- 
Matthew Garrett | [EMAIL PROTECTED]

Re: [PATCH] Display Intel Dynamic Acceleration feature in /proc/cpuinfo

2007-05-29 Thread Venki Pallipadi

On Thu, May 24, 2007 at 05:04:13PM -0700, H. Peter Anvin wrote:
> 
> If they grow slowly from the bottom, I guess we could simply allocate
> space in the vector byte by byte instead.  Either way, it means more
> work whenever anything has to change.
> 

hpa,

The patch below adds a new word for feature bits that will be used for all
Intel features that may be spread around in CPUID leafs like 0x6, 0xA, etc.
I added the "ida" bit to this word first. I will send an incremental patch
later to move the ARCH_PERFMON bit and any other feature bits from these leafs.
The patch is against the newsetup git tree.

Please apply.

Thanks,
Venki



Use a new CPU feature word to cover all Intel features that are spread around
in different CPUID leafs like 0x5, 0x6 and 0xA. Make this
feature detection code common across i386 and x86_64.

Display Intel Dynamic Acceleration feature in /proc/cpuinfo. This feature
will be enabled automatically by current acpi-cpufreq driver.

Refer to Intel Software Developer's Manual for more details about the feature.

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>
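
The feature numbering used by this patch packs a word index and a bit index into one integer (word * 32 + bit). A minimal user-space sketch of that encoding follows; it is an illustration only, and `struct cpu_caps`, `set_feature`, and `has_feature` are hypothetical stand-ins rather than the kernel's own helpers.

```c
#include <assert.h>
#include <stdint.h>

#define NCAPINTS 8	/* number of 32-bit capability words, as in the patch */

/* Feature numbers encode (word * 32 + bit), in the style of the header. */
#define FEATURE_IDA (7 * 32 + 0)

/* Hypothetical stand-in for the per-CPU capability array. */
struct cpu_caps {
	uint32_t words[NCAPINTS];
};

/* Set one feature bit: the high bits select the word, the low 5 the bit. */
static void set_feature(struct cpu_caps *c, unsigned int feature)
{
	c->words[feature >> 5] |= UINT32_C(1) << (feature & 31);
}

/* Test one feature bit using the same split. */
static int has_feature(const struct cpu_caps *c, unsigned int feature)
{
	return (c->words[feature >> 5] >> (feature & 31)) & 1;
}
```

Because the word index is part of the feature number, adding word 7 means raising NCAPINTS and extending every per-word table, which is what the hunks below do.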

Index: linux-2.6/include/asm-i386/cpufeature.h
===
--- linux-2.6.orig/include/asm-i386/cpufeature.h 2007-05-29 07:30:28.0 -0700
+++ linux-2.6/include/asm-i386/cpufeature.h 2007-05-29 10:21:17.0 -0700
@@ -12,7 +12,7 @@
 #endif
 #include 
 
-#define NCAPINTS   7   /* N 32-bit words worth of info */
+#define NCAPINTS   8   /* N 32-bit words worth of info */
 
 /* Intel-defined CPU features, CPUID level 0x0001 (edx), word 0 */
 #define X86_FEATURE_FPU (0*32+ 0) /* Onboard FPU */
@@ -109,6 +109,9 @@
 #define X86_FEATURE_LAHF_LM (6*32+ 0) /* LAHF/SAHF in long mode */
 #define X86_FEATURE_CMP_LEGACY (6*32+ 1) /* If yes HyperThreading not valid */
 
+/* More extended Intel flags: From various new CPUID levels like 0x6, 0xA etc */
+#define X86_FEATURE_IDA (7*32+ 0) /* Intel Dynamic Acceleration */
+
 #define cpu_has(c, bit) \
(__builtin_constant_p(bit) &&   \
 ( (((bit)>>5)==0 && (1UL<<((bit)&31) & REQUIRED_MASK0)) || \
@@ -117,7 +120,8 @@
   (((bit)>>5)==3 && (1UL<<((bit)&31) & REQUIRED_MASK3)) || \
   (((bit)>>5)==4 && (1UL<<((bit)&31) & REQUIRED_MASK4)) || \
   (((bit)>>5)==5 && (1UL<<((bit)&31) & REQUIRED_MASK5)) || \
-  (((bit)>>5)==6 && (1UL<<((bit)&31) & REQUIRED_MASK6)) )  \
+  (((bit)>>5)==6 && (1UL<<((bit)&31) & REQUIRED_MASK6)) || \
+  (((bit)>>5)==7 && (1UL<<((bit)&31) & REQUIRED_MASK7)) )  \
  ? 1 : \
  test_bit(bit, (c)->x86_capability))
 #define boot_cpu_has(bit)  cpu_has(&boot_cpu_data, bit)
Index: linux-2.6/arch/i386/kernel/cpu/proc.c
===
--- linux-2.6.orig/arch/i386/kernel/cpu/proc.c 2007-05-29 07:30:20.0 -0700
+++ linux-2.6/arch/i386/kernel/cpu/proc.c 2007-05-29 08:20:51.0 -0700
@@ -65,6 +65,12 @@
"osvw", "ibs", NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+
+   /* Intel-defined (#3) */
+   "ida", NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
};
static const char * const x86_power_flags[] = {
"ts",   /* temperature sensor */
Index: linux-2.6/arch/x86_64/kernel/setup.c
===
--- linux-2.6.orig/arch/x86_64/kernel/setup.c 2007-05-29 07:30:21.0 -0700
+++ linux-2.6/arch/x86_64/kernel/setup.c 2007-05-29 09:20:01.0 -0700
@@ -699,6 +699,7 @@
/* Cache sizes */
unsigned n;
 
+   init_additional_intel_features(c);
init_intel_cacheinfo(c);
if (c->cpuid_level > 9 ) {
unsigned eax = cpuid_eax(10);
@@ -973,6 +974,12 @@
"osvw", "ibs", NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+
+   /* Intel-defined (#3) */
+   "ida", NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
};
static char *x86_power_flags[] = { 
"ts",   /* temperature sensor */
Index: linux-2.6/include/asm-i386/required-features.h

Re: [patch] CFS scheduler, -v12

2007-05-29 Thread Siddha, Suresh B

On Tue, May 29, 2007 at 04:54:29PM -0700, Peter Williams wrote:
> > I tried with various refresh rates of top too.. Do you see the issue
> > at runlevel 3 too?
> 
> I haven't tried that.
> 
> Do your spinners ever relinquish the CPU voluntarily?

Nope. Simple and plain while(1); 's

I can try 32-bit kernel to check.

thanks,
suresh

[RFC][PATCH] Replacing the /proc/<pid>/exe symlink code

2007-05-29 Thread Matt Helsley

This patch avoids holding the mmap semaphore while walking VMAs in response to
programs which read or follow the /proc/<pid>/exe symlink. This also allows us
to merge the mmu and nommu proc_exe_link() functions. The costs are holding a
separate reference to the executable file stored in the task struct and
increased code in the fork, exec, and exit paths.

Signed-off-by: Matt Helsley <[EMAIL PROTECTED]>
---

Compiled and passed simple tests for regressions when patched against a 2.6.20
and 2.6.22-rc2-mm1 kernel.
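
The reference handling described above can be sketched as a small user-space model. This is a simplified illustration under stated assumptions, not kernel code: `file_like`, `task_like`, and the `on_*` helpers are hypothetical names, and a plain counter stands in for get_file()/fput().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and struct task_struct. */
struct file_like { int refcount; };
struct task_like { struct file_like *exe_file; };

static void get_ref(struct file_like *f) { f->refcount++; }
static void put_ref(struct file_like *f) { f->refcount--; }

/* exec: drop the old executable and take over the caller's reference
 * to the new one instead of releasing it. */
static void on_exec(struct task_like *t, struct file_like *new_exe)
{
	if (t->exe_file)
		put_ref(t->exe_file);
	t->exe_file = new_exe;
}

/* fork: the child shares the parent's executable and holds its own
 * reference to it. */
static void on_fork(struct task_like *child, const struct task_like *parent)
{
	child->exe_file = parent->exe_file;
	if (child->exe_file)
		get_ref(child->exe_file);
}

/* exit: release the task's reference (clearing the pointer is an
 * addition of this sketch). */
static void on_exit_task(struct task_like *t)
{
	if (t->exe_file) {
		put_ref(t->exe_file);
		t->exe_file = NULL;
	}
}
```

In this model every task holds exactly one reference to its executable, so a reader of the symlink only needs the task, not a walk over its mappings.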

 fs/exec.c |5 +++--
 fs/proc/base.c|   20 
 fs/proc/internal.h|1 -
 fs/proc/task_mmu.c|   34 --
 fs/proc/task_nommu.c  |   34 --
 include/linux/sched.h |1 +
 kernel/exit.c |2 ++
 kernel/fork.c |   10 +-
 8 files changed, 35 insertions(+), 72 deletions(-)

Index: linux-2.6.22-rc2-mm1/include/linux/sched.h
===
--- linux-2.6.22-rc2-mm1.orig/include/linux/sched.h
+++ linux-2.6.22-rc2-mm1/include/linux/sched.h
@@ -988,10 +988,11 @@ struct task_struct {
int oomkilladj; /* OOM kill score adjustment (bit shift). */
char comm[TASK_COMM_LEN]; /* executable name excluding path
 - access with [gs]et_task_comm (which lock
   it with task_lock())
 - initialized normally by flush_old_exec */
+   struct file *exe_file;
 /* file system info */
int link_count, total_link_count;
 #ifdef CONFIG_SYSVIPC
 /* ipc stuff */
struct sysv_sem sysvsem;
Index: linux-2.6.22-rc2-mm1/fs/exec.c
===
--- linux-2.6.22-rc2-mm1.orig/fs/exec.c
+++ linux-2.6.22-rc2-mm1/fs/exec.c
@@ -1106,12 +1106,13 @@ int search_binary_handler(struct linux_b
read_unlock(&binfmt_lock);
retval = fn(bprm, regs);
if (retval >= 0) {
put_binfmt(fmt);
allow_write_access(bprm->file);
-   if (bprm->file)
-   fput(bprm->file);
+   if (current->exe_file)
+   fput(current->exe_file);
+   current->exe_file = bprm->file;
bprm->file = NULL;
current->did_exec = 1;
proc_exec_connector(current);
return retval;
}
Index: linux-2.6.22-rc2-mm1/fs/proc/base.c
===
--- linux-2.6.22-rc2-mm1.orig/fs/proc/base.c
+++ linux-2.6.22-rc2-mm1/fs/proc/base.c
@@ -951,10 +951,30 @@ const struct file_operations proc_pid_sc
.write  = sched_write,
.llseek = seq_lseek,
.release= seq_release,
 };
 
+static int proc_exe_link(struct inode *inode, struct dentry **dentry,
+struct vfsmount **mnt)
+{
+   int error;
+   struct task_struct *task;
+
+   task = get_proc_task(inode);
+   if (!task)
+   return -ENOENT;
+   error = -ENOSYS;
+   if (!task->exe_file)
+   goto out;
+   *mnt = mntget(task->exe_file->f_path.mnt);
+   *dentry = dget(task->exe_file->f_path.dentry);
+   error = 0;
+out:
+   put_task_struct(task);
+   return error;
+}
+
 static void *proc_pid_follow_link(struct dentry *dentry, struct nameidata *nd)
 {
struct inode *inode = dentry->d_inode;
int error = -EACCES;
 
Index: linux-2.6.22-rc2-mm1/kernel/exit.c
===
--- linux-2.6.22-rc2-mm1.orig/kernel/exit.c
+++ linux-2.6.22-rc2-mm1/kernel/exit.c
@@ -924,10 +924,12 @@ fastcall void do_exit(long code)
if (unlikely(tsk->audit_context))
audit_free(tsk);
 
taskstats_exit(tsk, group_dead);
 
+   if (tsk->exe_file)
+   fput(tsk->exe_file);
exit_mm(tsk);
 
if (group_dead)
acct_process();
exit_sem(tsk);
Index: linux-2.6.22-rc2-mm1/kernel/fork.c
===
--- linux-2.6.22-rc2-mm1.orig/kernel/fork.c
+++ linux-2.6.22-rc2-mm1/kernel/fork.c
@@ -1163,10 +1163,13 @@ static struct task_struct *copy_process(
 
/* ok, now we should be set up.. */
p->exit_signal = (clone_flags & CLONE_THREAD) ? -1 : (clone_flags & CSIGNAL);
p->pdeath_signal = 0;
p->exit_state = 0;
+   p->exe_file = current->exe_file;
+   if (p->exe_file)
+   get_file(p->exe_file);
 
/*
 * Ok, make it visible to the rest of the system.
 * We dont wake it up yet.

Re: [patch 2/2] m68k: Discontinuous memory support

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 21:16:32 +0200
Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:

> -#define __page_address(page) (PAGE_OFFSET + (((page) - mem_map) << PAGE_SHIFT))
> -#define page_to_phys(page)   virt_to_phys((void *)__page_address(page))
> +#ifdef CONFIG_SINGLE_MEMORY_CHUNK
> +#define page_to_phys(page) \
> + __pa(PAGE_OFFSET + (((page) - pg_data_map[0].node_mem_map) << PAGE_SHIFT))
> +#else
> +#define page_to_phys(page) ({ \
> + struct pglist_data *pgdat;  \
> + pgdat = pg_data_table[page_to_nid(page)];   \
> + page_to_pfn(page) << PAGE_SHIFT;\
> +})
> +#endif

macros which evaluate their args more than once are dangerous.
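
A minimal illustration of the hazard, assuming nothing about the m68k code itself: the macro, function, and helper names below (`SQUARE_MACRO`, `square_fn`, `next_value`) are made up for the example. An argument with a side effect runs twice when the macro mentions its parameter twice, but only once when passed to a function.

```c
#include <assert.h>

/* Mentions its parameter twice, so the argument is evaluated twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* A function evaluates its argument exactly once. */
static inline int square_fn(int x)
{
	return x * x;
}

static int calls;

/* Helper with a visible side effect: counts how often it is evaluated. */
static int next_value(void)
{
	calls++;
	return 3;
}
```

The usual remedies are an inline function, or copying the argument into a local variable at the top of the macro body.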

Re: [patch 2/2] m68k: Discontinuous memory support

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 21:16:32 +0200
Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:

> + for_each_online_pgdat(pgdat) {
> + for (i = 0; i < pgdat->node_spanned_pages; i++) {
> + struct page *page = pgdat->node_mem_map + i;
> + total++;
> + if (PageReserved(page))
> + reserved++;
> + else if (PageSwapCache(page))
> + cached++;
> + else if (!page_count(page))
> + free++;

This isn't really true.  Callers of the page allocator don't _have_ to use
page_count(): they can internally perform their own refcounting.  One such
caller is slab, so this "free" count can end up being grossly wrong.

> + else
> + shared += page_count(page) - 1;
> + }



Re: [BUG] Something goes wrong with timer statistics.

2007-05-29 Thread Stephane Casset

On Tue, May 29, 2007 at 11:38:48PM +0200, Ian Kumlien wrote:
> Hi, 
> 
> As the daystar sets, I try to play some with my new would-be
> firewall/server, but since this will be running for quite some time I
> have been experimenting with powertop to find out what I can do to limit
> its power usage.
> 
> But, if I run powertop for too long or a few times too many... this
> happens:
> http://pomac.netswarm.net/pics/kernel_panic.jpg
> 
> If i don't run powertop, it is rock solid... Compiling for hours,
> running memtest for hours etc etc... 

Same here, P4 HT: if I run powertop for 5-10 minutes the system crashes :(
It happens with 2.6.22-rc{2,3} at least :( I didn't try earlier kernels.

I don't have an oops since the last crash was in X, but next time I will
try to catch it. If there is any patch around to fix this I am willing
to test.

Regards
-- 
Stéphane CassetLOGIDÉE sàrl   Se faire plaisir d'apprendre
1a, rue PasteurTel : +33 388 23 69 77   [EMAIL PROTECTED]
F-67540 OSTWALDFax : +33 388 23 69 77   http://logidee.com

Re: [PATCH 2/7] cxgb3 - fix netpoll hanlder

2007-05-29 Thread Jeff Garzik


Jeremy Fitzhardinge wrote:
> Jeff Garzik wrote:
>>> +t3_intr_handler(adapter, qs->rspq.polling) (0,
>>> +(adapter->flags & USING_MSIX) ?
>>> +(void *)qs : (void *)adapter);
>>
>> Remove needless casts to void*
>
> The two branches of ?: need to have the same type; without the casts
> they'd be "struct sge_qset" and "struct adapter".  Seems a bit cruddy to
> have two types passed to one function depending on the MSI state, but
> maybe that's OK.

Look at the function argument...

Jeff
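
As a side note on the C rule being discussed, here is a small sketch; the types and the function are hypothetical stand-ins, not the cxgb3 code. When one operand of ?: is a pointer to void, the other pointer operand is converted to it, so a cast on a single branch is enough for the expression to have type void *.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two cookie types in the thread. */
struct qset_like { int id; };
struct adapter_like { int id; };

/* Picks the cookie to hand to a handler that takes an untyped pointer. */
static void *pick_cookie(int use_msix, struct qset_like *qs,
			 struct adapter_like *adap)
{
	/* One operand is already 'void *', so the other is converted to it
	 * and the whole conditional expression has type 'void *'. */
	return use_msix ? (void *)qs : adap;
}
```

With no cast at all, the two operands would be pointers to incompatible struct types, which compilers typically diagnose.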




Re: [patch 1/2] m68k: runtime patching infrastructure

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 21:16:31 +0200
Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:

> --- a/include/asm-m68k/module.h
> +++ b/include/asm-m68k/module.h
> @@ -1,7 +1,38 @@
>  #ifndef _ASM_M68K_MODULE_H
>  #define _ASM_M68K_MODULE_H
> -struct mod_arch_specific { };
> +
> +struct mod_arch_specific {
> + struct m68k_fixup_info *fixup_start, *fixup_end;
> +};

Here we use struct m68k_fixup_info.

> +#define MODULE_ARCH_INIT {   \
> + .fixup_start= __start_fixup,\
> + .fixup_end  = __stop_fixup, \
> +}
> +
>  #define Elf_Shdr Elf32_Shdr
>  #define Elf_Sym Elf32_Sym
>  #define Elf_Ehdr Elf32_Ehdr
> +
> +
> +enum m68k_fixup_type {
> + m68k_fixup_memoffset,
> +};
> +
> +struct m68k_fixup_info {
> + enum m68k_fixup_type type;
> + void *addr;
> +};

and later we define it.

How come it doesn't spit warnings?

I think it could be tightened up even if it happens not to warn?

> +#define m68k_fixup(type, addr)   \
> + "   .section \".m68k_fixup\",\"aw\"\n"  \
> + "   .long " #type "," #addr "\n"\
> + "   .previous\n"
> +
> +extern struct m68k_fixup_info __start_fixup[], __stop_fixup[];
> +
> +struct module;
> +extern void module_fixup(struct module *mod, struct m68k_fixup_info *start,
> +  struct m68k_fixup_info *end);
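
On the question above of why using the struct before its definition need not warn: a likely explanation is that naming `struct tag *` in a member declaration at file scope implicitly declares an incomplete type, which is sufficient for pointers. The sketch below uses hypothetical names (`arch_specific`, `fixup_info`, `fixup_count`) rather than the m68k ones.

```c
#include <assert.h>
#include <stddef.h>

/* First use: this implicitly declares 'struct fixup_info' at file scope
 * as an incomplete type, which is enough to declare pointers to it. */
struct arch_specific {
	struct fixup_info *start, *end;
};

/* The later definition completes the same type. */
struct fixup_info {
	int type;
	void *addr;
};

/* Pointer arithmetic needs the complete type, which is available here. */
static size_t fixup_count(const struct arch_specific *a)
{
	return (size_t)(a->end - a->start);
}
```

Moving the definition, or an explicit `struct fixup_info;` forward declaration, ahead of the first use would make the ordering obvious to readers, which may be the kind of tightening being asked for.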

Re: [RFC, PATCH 1/3] introduce SYS_CLONE_MASK

2007-05-29 Thread Albert Cahalan


On 5/29/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

"Albert Cahalan" <[EMAIL PROTECTED]> writes:
> Jan Engelhardt writes:



-if(self_pid==1 && ADOPTED(processes[i]) && forest_type!='u')
+if(ADOPTED(processes[i]) && forest_type!='u')


That's not compatible because init's children are now in the
logical place. Since the days of procps-1.x.x or earlier,
such processes have been listed at top level.

BTW, what does "ps -ejH" do for you, with and without the patch?


ps -ejH displays everything.


That's not what I mean. (the "-e" causes that of course)
I'm asking about the parent-child relationships shown.
The "-H" option is a bit different from the "f" option.


I'd be a lot happier about breaking compatibility in this area
if I could get a functional adoption flag. That is, I really
would like to show a process as child of init if it naturally
was created as a child of init. It's less informative to have
fake children showing up the same as real ones. The original
parent PID would do. (BTW, the original parent name and/or
grandparent PID would be great to have) As a bonus, the kernel
could reap these processes more quickly than init can... and
then maybe we can stop caring if init is alive.


Having the kernel not reparent user processes to init is an interesting
idea, especially when those processes have not existed.  I'm not
certain that is POSIX compliant and otherwise backwards compatible.


I'm not suggesting that this be visible via POSIX APIs.

It's almost certainly a given that getppid() must return 1, and
probably /proc needs to show this as well. Without question,
any process created by init must be reaped by init.

Processes NOT created by init could be silently reaped by
the kernel. They need to see their own PPID as 1, but there
need not be any parent-child relationship in the kernel data
structures. The kernel can fake the whole thing, which is nice
because then the kernel isn't depending on userspace to
correctly perform the pointless action of playing with zombies.
(might setting the death signal to 0 be useful here?)

For "ps fax" and such, I'd like to distinguish between init's
real and adopted children. Right now the adopted children
look like they were created by init, which is not true. I only
need a simple boolean flag, set upon reparenting, to tell me.
Such a flag may also be useful for optimizing away the whole
wait/waitpid/wait4/waitid/wait3 nonsense when an adopted
child dies.
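
A toy userspace sketch of the boolean flag described above. Nothing here is kernel code: `struct toy_task`, the `adopted` field and both functions are hypothetical names, and the real `task_struct` is not modelled.

```c
#include <stdbool.h>

/* Toy stand-in for a task; not the kernel's task_struct. */
struct toy_task {
    int pid;
    int ppid;      /* what getppid() would report */
    bool adopted;  /* hypothetical flag, set once upon reparenting */
};

/* Reparent an orphan to init (PID 1) and remember that it was adopted. */
static void toy_reparent_to_init(struct toy_task *t)
{
    t->ppid = 1;
    t->adopted = true;
}

/* A tool such as ps could use the flag to tell init's real children
 * from the adopted ones, even though both report PPID 1. */
static bool toy_is_real_child_of_init(const struct toy_task *t)
{
    return t->ppid == 1 && !t->adopted;
}
```

Both kinds of task still see PPID 1, so the POSIX-visible behaviour is unchanged in this sketch; only the extra flag differs.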

Re: [PATCH] RTC: Use fallback IRQ if PNP tables don't provide one

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 18:50:22 +0000 (UTC)
Matthieu CASTET <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> 
> On Mon, 28 May 2007 18:24:18 +0100, Matthew Garrett wrote:
> 
> > From: Matthew Garrett <[EMAIL PROTECTED]>
> > 
> > Intel Macs (and possibly other machines) provide a PNP entry for the
> > RTC, but provide no IRQ. As a result the rtc-cmos driver doesn't allow
> > wakeup alarms. If the RTC is located at the legacy ioport range, assume
> > that it's on IRQ 8 unless the tables say otherwise.
> I posted something via gmane this morning, but it seems it was lost:
> 
> Did you check whether there are multiple configurations for the RTC (one
> with an IRQ, and one without it)?
> 
> What's the output of 
> $ for i in /sys/bus/pnp/devices/*; do if [ "$(cat $i/id)" = PNP0b00 ]; then cat $i/resources; echo options; cat $i/options; fi; done
> 

Matthew didn't reply to this, almost surely because you removed him
(and David) from the cc.  Please don't ever do that.

Re: [PATCH] add procfs tunable to enable immediate panic when there are busy inodes after umount

2007-05-29 Thread David Chinner

On Tue, May 29, 2007 at 11:40:42AM -0400, Jeff Layton wrote:
> After spending quite a bit of time tracking down a "VFS: busy inodes
> after unmount" problem, it occurs to me that it would be nice to be
> able to force a panic when that occurs. While an oops message alone is
> not generally helpful for tracking down this sort of problem,
> collecting and analyzing a coredump when this occurs can be.

Agreed - we've found that we've had roughly 50% success in finding
the cause of these problems from crash dumps triggered immediately
like this vs ~0% from a crash that occurred some time later.

Given that this problem will always result in a crash of the kernel
at some random time in the future, why don't we just make this error
an unconditional panic and get the crash over and done with?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Re: [patch -mm 1/1] remove useless tolower in isofs

2007-05-29 Thread young dave


Hi,
Thank you, andrew.


Your email client replaces tabs with spaces.


Really? I use gmail web via firefox,  next time I will use mutt to
send patches.

Regards
dave

Re: [PATCH] /proc/*/environ: wrong placing of ptrace_may_attach() check

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 17:41:57 +0400
Alexey Dobriyan <[EMAIL PROTECTED]> wrote:

> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>

Better changelogs, please.

> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -204,12 +204,17 @@ static int proc_pid_environ(struct task_
>   int res = 0;
>   struct mm_struct *mm = get_task_mm(task);
>   if (mm) {
> - unsigned int len = mm->env_end - mm->env_start;
> + unsigned int len;
> +
> + res = -ESRCH;
> + if (!ptrace_may_attach(task))
> + goto out;
> +
> + len  = mm->env_end - mm->env_start;
>   if (len > PAGE_SIZE)
>   len = PAGE_SIZE;
>   res = access_process_vm(task, mm->env_start, buffer, len, 0);
> - if (!ptrace_may_attach(task))
> - res = -ESRCH;
> +out:
>   mmput(mm);
>   }
>   return res;

What's wrong with the existing code?  It's a bit dopey-looking and can, I
guess, permit a task to cause a pagefault in an mm which it doesn't have
permission to read from.  But is there some more serious problem being
fixed here?

I shouldn't have to ask this stuff.

Re: [ckrm-tech] [RFC] [PATCH 0/3] Add group fairness to CFS

2007-05-29 Thread hui

On Mon, May 28, 2007 at 10:09:19PM +0530, Srivatsa Vaddagiri wrote:
> On Fri, May 25, 2007 at 10:14:58AM -0700, Li, Tong N wrote:
> > is represented by a weight of 10. Inside the group, let's say the two
> > tasks, P1 and P2, have weights 1 and 2. Then the system-wide weight for
> > P1 is 10/3 and the weight for P2 is 20/3. In essence, this flattens
> > weights into one level without changing the shares they represent.
> 
> What do these task weights control? Timeslice primarily? If so, I am not
> sure how well it can co-exist with cfs then (unless you are planning to
> replace cfs with an equally good interactive/fair scheduler :)

It's called SD. From Con Kolivas, who got it right the first time around :)

> I would be very interested if this weight calculation can be used for
> smpnice based load balancing purposes too ..
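
The flattening arithmetic quoted above (group weight 10, task weights 1 and 2, giving 10/3 and 20/3) can be sketched as follows. This is only an illustration of that arithmetic under the assumption of a single level of grouping; `flattened_weight()` is a hypothetical name, not something from the patch set.

```c
/* Flattened (system-wide) weight of one task inside a group:
 *   group_weight * task_weight / (sum of task weights in the group)
 * The caller must pass a non-empty group with positive weights. */
static double flattened_weight(double group_weight, double task_weight,
                               const double *group_task_weights, int n)
{
    double sum = 0.0;

    for (int i = 0; i < n; i++)
        sum += group_task_weights[i];
    return group_weight * task_weight / sum;
}
```

The flattened weights of all tasks in a group add up to the group's own weight, which is why the shares they represent do not change.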

bill


Re: [ckrm-tech] [RFC] [PATCH 0/3] Add group fairness to CFS

2007-05-29 Thread Peter Williams


William Lee Irwin III wrote:

William Lee Irwin III wrote:

Lag should be considered in lieu of load because lag


On Sun, May 27, 2007 at 11:29:51AM +1000, Peter Williams wrote:

What's the definition of lag here?


Lag is the deviation of a task's allocated CPU time from the CPU time
it would be granted by the ideal fair scheduling algorithm (generalized
processor sharing; take the limit of RR with per-task timeslices
proportional to load weight as the scale factor approaches zero).


Over what time period does this operate?


Negative lag reflects receipt of excess CPU time. A close-to-canonical
"fairness metric" is the maximum of the absolute values of the lags of
all the tasks on the system. The "signed minimax pseudonorm" is the
largest lag without taking absolute values; it's a term I devised ad
hoc to describe the proposed algorithm.
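
A small sketch of the three quantities just defined, assuming lag is computed as ideal (GPS) CPU time minus actual CPU time, so that negative lag means excess CPU time was received. The struct and function names are hypothetical, and how the ideal time is obtained is left out.

```c
/* CPU time accounting for one task; both values in the same unit. */
struct toy_task_time {
    double ideal;   /* CPU time the ideal GPS schedule would have granted */
    double actual;  /* CPU time actually received */
};

/* lag > 0: the task is owed CPU time; lag < 0: it received excess. */
static double lag(const struct toy_task_time *t)
{
    return t->ideal - t->actual;
}

/* Fairness metric: maximum of the absolute values of all lags. */
static double max_abs_lag(const struct toy_task_time *t, int n)
{
    double worst = 0.0;

    for (int i = 0; i < n; i++) {
        double l = lag(&t[i]);

        if (l < 0)
            l = -l;
        if (l > worst)
            worst = l;
    }
    return worst;
}

/* "Signed minimax pseudonorm": largest lag without taking absolute
 * values. Requires n >= 1. */
static double max_signed_lag(const struct toy_task_time *t, int n)
{
    double best = lag(&t[0]);

    for (int i = 1; i < n; i++) {
        double l = lag(&t[i]);

        if (l > best)
            best = l;
    }
    return best;
}
```

The two metrics differ whenever the largest deviation belongs to a task that is ahead of its ideal share.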


So what you're saying is that you think dynamic priority (or its 
equivalent) should be used for load balancing instead of static priority?




William Lee Irwin III wrote:

is what the
scheduler is trying to minimize;


On Sun, May 27, 2007 at 11:29:51AM +1000, Peter Williams wrote:
This isn't always the case.  Some may prefer fairness to minimal lag. 
Others may prefer particular tasks to receive preferential treatment.


This comment does not apply. Generalized processor sharing expresses
preferential treatment via weighting. Various other forms of
preferential treatment require more elaborate idealized models.


This was said before I realized that your "lag" is just a measure of 
fairness.






load is not directly relevant, but
appears to have some sort of relationship. Also, instead of pinned,
unpinned should be considered.


On Sun, May 27, 2007 at 11:29:51AM +1000, Peter Williams wrote:
If you have total and pinned you can get unpinned.  It's probably 
cheaper to maintain data for pinned than unpinned as there's less of it 
on normal systems.


Regardless of the underlying accounting,


I was just replying to your criticism of my suggestion to keep pinned 
task statistics and use them.



I've presented a coherent
algorithm. It may be that there's no demonstrable problem to solve.
On the other hand, if there really is a question as to how to load
balance in the presence of tasks pinned to cpus, I just answered it.


Unless I missed something there's nothing in your suggestion that does 
anything more about handling pinned tasks than is already done by the 
load balancer.





William Lee Irwin III wrote:

Using the signed minimax pseudonorm (i.e. the highest
signed lag, where positive is higher than all negative regardless of
magnitude) on unpinned lags yields a rather natural load balancing
algorithm consisting of migrating from highest to lowest signed lag,
with progressively longer periods for periodic balancing across
progressively higher levels of hierarchy in sched_domains etc. as usual.
Basically skip over pinned tasks as far as lag goes.
The trick with all that comes when tasks are pinned within a set of
cpus (especially crossing sched_domains) instead of to a single cpu.
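
One possible reading of the balancing step described above, as a toy sketch: compute each CPU's highest signed lag over its unpinned tasks, then migrate from the CPU with the highest value to the CPU with the lowest. All names are hypothetical, tasks are pinned to single CPUs only, and CPUs without unpinned tasks are simply skipped, which is an assumption of this sketch rather than part of the proposal.

```c
#include <stdbool.h>

struct toy_lag_task {
    int cpu;      /* CPU the task currently runs on */
    double lag;   /* ideal minus actual CPU time */
    bool pinned;  /* pinned to its current CPU, so skipped */
};

/* Highest signed lag among unpinned tasks on `cpu`.
 * *found is set to false if the CPU has no unpinned task. */
static double cpu_signed_lag(const struct toy_lag_task *t, int n, int cpu,
                             bool *found)
{
    double best = 0.0;

    *found = false;
    for (int i = 0; i < n; i++) {
        if (t[i].cpu != cpu || t[i].pinned)
            continue;
        if (!*found || t[i].lag > best)
            best = t[i].lag;
        *found = true;
    }
    return best;
}

/* Pick source (highest signed lag) and destination (lowest signed lag).
 * Returns false if no two distinct CPUs qualify. */
static bool pick_migration(const struct toy_lag_task *t, int n, int ncpus,
                           int *src, int *dst)
{
    bool have = false;
    double hi = 0.0, lo = 0.0;

    for (int cpu = 0; cpu < ncpus; cpu++) {
        bool found;
        double l = cpu_signed_lag(t, n, cpu, &found);

        if (!found)
            continue;
        if (!have || l > hi) {
            hi = l;
            *src = cpu;
        }
        if (!have || l < lo) {
            lo = l;
            *dst = cpu;
        }
        have = true;
    }
    return have && *src != *dst;
}
```

Tasks pinned within a set of CPUs, the hard case mentioned above, are not handled here.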


On Sun, May 27, 2007 at 11:29:51AM +1000, Peter Williams wrote:
Yes, this makes the cost of maintaining the required data higher which 
makes keeping pinned data more attractive than unpinned.
BTW keeping data for sets of CPU affinities could cause problems as the 
number of possible sets is quite large (being 2 to the power of the 
number of CPUs).  So you need an algorithm based on pinned data for 
single CPUs that knows the pinning isn't necessarily exclusive rather 
than one based on sets of CPUs.  As I understand it (which may be 
wrong), the mechanism you describe below takes that approach.


Yes, the mechanism I described takes that approach.


William Lee Irwin III wrote:

The smpnice affair is better phrased in terms of task weighting. It's
simple to honor nice in such an arrangement. First unravel the
grouping hierarchy, then weight by nice. This looks like

[...]

In such a manner nice numbers obey the principle of least surprise.


On Sun, May 27, 2007 at 11:29:51AM +1000, Peter Williams wrote:
Is it just me or did you stray from the topic of handling cpu affinity 
during load balancing to hierarchical load balancing?  I couldn't see 
anything in the above explanation that would improve the handling of cpu 
affinity.


There was a second issue raised to which I responded. I didn't stray
per se. I addressed a second topic in the post.


OK.

To reiterate, I don't think that my suggestion is really necessary.  I 
think that the current load balancing (stand fast a small bug that's 
being investigated) will come up with a good distribution of tasks to 
CPUs within the constraints imposed by any CPU affinity settings.


Peter
--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

Re: [PATCH 8/8] Char: vt_ioctl, use wait_event_interruptible

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 15:31:12 +0200 (CEST)
Jiri Slaby <[EMAIL PROTECTED]> wrote:

> vt_ioctl, use wait_event_interruptible
> 
> Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>
> 
> ---
> commit fbe1931e02f11b2fef771ff1698f1598b3567520
> tree fcfcd72a5619f6e26598ac2ee3132aed4b070987
> parent c025c4b3eca99f50b05bc24c445b861e91226539
> author Jiri Slaby <[EMAIL PROTECTED]> Sat, 26 May 2007 22:58:17 +0200
> committer Jiri Slaby <[EMAIL PROTECTED]> Sat, 26 May 2007 22:58:17 +0200
> 
>  drivers/char/vt_ioctl.c |   22 --
>  1 files changed, 4 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/char/vt_ioctl.c b/drivers/char/vt_ioctl.c
> index c6f6f42..2056367 100644
> --- a/drivers/char/vt_ioctl.c
> +++ b/drivers/char/vt_ioctl.c
> @@ -1035,12 +1035,8 @@ static DECLARE_WAIT_QUEUE_HEAD(vt_activate_queue);
>  int vt_waitactive(int vt)
>  {
>   int retval;
> - DECLARE_WAITQUEUE(wait, current);
> -
> - add_wait_queue(&vt_activate_queue, &wait);
> - for (;;) {
> - retval = 0;
>  
> + return wait_event_interruptible(vt_activate_queue, ({
>   /*
>* Synchronize with redraw_screen(). By acquiring the console
>* semaphore we make sure that the console switch is completed
> @@ -1049,20 +1045,10 @@ int vt_waitactive(int vt)
>* updated, but the console switch hasn't been completed.
>*/
>   acquire_console_sem();
> - set_current_state(TASK_INTERRUPTIBLE);
> - if (vt == fg_console) {
> - release_console_sem();
> - break;
> - }
> + retval = vt == fg_console;
>   release_console_sem();
> - retval = -EINTR;
> - if (signal_pending(current))
> - break;
> - schedule();
> - }
> - remove_wait_queue(&vt_activate_queue, &wait);
> - __set_current_state(TASK_RUNNING);
> - return retval;
> + retval;
> + }));
>  }
>  

So we end up with

int vt_waitactive(int vt)
{
int retval;

return wait_event_interruptible(vt_activate_queue, ({
/*
 * Synchronize with redraw_screen(). By acquiring the console
 * semaphore we make sure that the console switch is completed
 * before we return. If we didn't wait for the semaphore, we
 * could return at a point where fg_console has already been
 * updated, but the console switch hasn't been completed.
 */
acquire_console_sem();
retval = vt == fg_console;
release_console_sem();
retval;
}));
}

Again, I do think this needs a helper function.  Or something.  The handling
of `retval' in there is pretty perverse.  We're modifying a local variable
within the macro and then returning it by value?  Perhaps a bit cleaner
would be to move `retval' inside the macro body.

But a helper function would be better.  Again, remember that the macro
evaluates that expression twice - you may find that the helper function
even generates less code.
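
A rough sketch of what such a helper might look like. The helper name `vt_is_active()` is hypothetical, and the definitions at the top are userspace stand-ins so the fragment compiles outside the kernel; in particular the stand-in macro never sleeps and returns -1 where the real `wait_event_interruptible()` would wait or return -ERESTARTSYS.

```c
/* Userspace stand-ins, for illustration only. In the kernel these come
 * from the console and wait-queue headers. */
static int fg_console;
static int console_sem_depth;
static int vt_activate_queue;
static void acquire_console_sem(void) { console_sem_depth++; }
static void release_console_sem(void) { console_sem_depth--; }
#define wait_event_interruptible(wq, condition) \
    ((void)(wq), (condition) ? 0 : -1)

/* Hypothetical helper: sample fg_console under the console semaphore so
 * that a console switch in progress has completed before we look. */
static int vt_is_active(int vt)
{
    int active;

    acquire_console_sem();
    active = (vt == fg_console);
    release_console_sem();
    return active;
}

int vt_waitactive(int vt)
{
    return wait_event_interruptible(vt_activate_queue, vt_is_active(vt));
}
```

With the helper, `retval` disappears from `vt_waitactive()` entirely, and the condition expanded by the macro is a single function call.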


Re: joydev.c and saitek cyborg evo force

2007-05-29 Thread Jiri Kosina

On Wed, 30 May 2007, Renato Golin wrote:

> The HID sources are quite different from 2.6.21 and 2.6.20 but I don't 
> know how much was because of the Canonical guys and how much it really changed. 
> :( I will eventually put a Gentoo on my old laptop and try it for real, 
> sorry I couldn't be of much help now...

Hi Renato,

well, I changed the overall HID code design in 2.6.21 a little bit. 
Anyway, just hardcoding '#define DEBUG' and '#define DEBUG_DATA' (that's 
also important) in 2.6.20-and-older kernels should have a similar result to 
CONFIG_HID_DEBUG in post-2.6.21.

In your particular case, the DEBUG_DATA might really be more interesting.

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [linux-usb-devel] Dealing with flaky USB storage devices and rootfs

2007-05-29 Thread Dan Aloni

On Tue, May 29, 2007 at 05:50:49PM -0400, Alan Stern wrote:
> On Tue, 29 May 2007, Dan Aloni wrote:
> 
> > Hello,
> > 
> > We have a system where the rootfs is a partition on a USB device,
> > and I've noticed upon a few rare cases where the USB controller 
> > loses the connection to the USB device after some uptime (days,
> > weeks...), and the USB device reappears a very short time later.
> 
> That failure mode is pretty uncommon.  More often what happens is the
> connection remains intact but communication/protocol/firmware/???  
> errors cause the device to stop working.  It never disappears but it
> can't be used again without unplugging or power-cycling.

Yes, this is also what I assume is happening, i.e. more likely a USB 
flash disk firmware bug than a controller bug (there are lots of 
crappy USB flash drives out there).
 
> > So, I gave this some thought, and I came up with two 
> > main solutions:
> > 
> > 1) Improve the USB storage error handling - bind the already 
> > existing SCSI host to the USB port that has the device, e.g., 
> > if host2 got created for usb 5-3 then keep it that way for the 
> > sake of EH. /dev/sda1 should come to life when the USB device 
> > recovers, unless a few seconds have passed or some attributes 
> > (such as manufacturer ID or serial) have changed.
> 
> The difficulty here is that this "error" is indistinguishable from
> normal activity -- someone simply unplugs the device and then later on
> another connection is made.  It might be the same device as before or
> it might be a different one.  In other words, it isn't really an error.  
> You would solve this by relying on "a few seconds" timeout.
[...]
> 
> It also goes against the USB specification.  And it is potentially 
> unsafe, in that it is possible for users to change media or make other 
> alterations that the kernel cannot detect.  The same would be true of 
> your proposal, assuming that somebody was quick enough to unplug one 
> device and plug in another (or swap memory cards) in the span of a few 
> seconds.

The specific use case I refer to is with a flash drive embedded 
inside a locked and closed chassis of a dedicated server. So, 
anyone replugging it must know what they are doing anyway.
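
For that closed-chassis case, the check from proposal 1 (same manufacturer ID and serial, reappearing within a few seconds) might be sketched like this. Everything here is hypothetical: the struct, the function name and the 5-second window are illustrative choices, not existing usb-storage or SCSI interfaces.

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative identity of a USB storage device. */
struct toy_usb_id {
    unsigned short vendor;  /* manufacturer ID */
    char serial[32];        /* serial number string */
};

#define REBIND_WINDOW_SECS 5  /* assumed value for "a few seconds" */

/* Return true if a device that reappeared on the same port may be bound
 * to the SCSI host that was kept around for error handling. */
static bool may_rebind(const struct toy_usb_id *lost,
                       const struct toy_usb_id *seen,
                       unsigned int elapsed_secs)
{
    if (elapsed_secs > REBIND_WINDOW_SECS)
        return false;
    return lost->vendor == seen->vendor &&
           strcmp(lost->serial, seen->serial) == 0;
}
```

As noted in the quoted reply, a check like this cannot detect a media change made inside the window; it only narrows the cases in which the old host is reused.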

-- 
Dan Aloni
XIV LTD, http://www.xivstorage.com
da-x (at) monatomic.org, dan (at) xiv.co.il

Re: [BUG] Something goes wrong with timer statistics.

2007-05-29 Thread Ian Kumlien

On tis, 2007-05-29 at 16:03 -0700, David Miller wrote:
> From: Ian Kumlien <[EMAIL PROTECTED]>
> Date: Wed, 30 May 2007 00:51:52 +0200
> 
> > On tis, 2007-05-29 at 15:44 -0700, David Miller wrote:
> > > From: "Michal Piotrowski" <[EMAIL PROTECTED]>
> > > Date: Wed, 30 May 2007 00:41:46 +0200
> > > 
> > > > On 30/05/07, David Miller <[EMAIL PROTECTED]> wrote:
> > > > > From: Ian Kumlien <[EMAIL PROTECTED]>
> > > > > Date: Tue, 29 May 2007 23:38:48 +0200
> > > > >
> > > > > > As the daystar sets, i try to play some with my new would be
> > > > > > firewall/server, but since this will be running for quite some
> > > > > > time i have been experimenting with powertop to find out what i
> > > > > > can do to limit its power usage.
> > > > > >
> > > > > > But, if i run powertop for too long or a few times too many... this
> > > > > > happens:
> > > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > > > > >
> > > > > > If i don't run powertop, it is rock solid... Compiling for hours,
> > > > > > running memtest for hours etc etc...
> > > > >
> > > > > I see this same exact problem on sparc64.
> > > > 
> > > > Have you tried this patch?
> > > > http://lkml.org/lkml/2007/5/29/392
> > > 
> > > Of course, I worked with Thomas on the fix :-)
> > 
> > I dunno if this applies to me, i run a 32 bit userspace...
> 
> That patch fixes a problem on 64-bit kernels, regardless of
> userspace, when NOHZ is enabled.

Sorry, i run all 32 bit, i was on the phone while typing that, so a wee
bit distracted. Anyways, i run a nohz 32 bit kernel.

I dunno how good an Intel T7200 is when it comes to 64 bit
kernel/userspace.. (Haven't really read up on it)

-- 
Ian Kumlien  -- http://pomac.netswarm.net

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-29 Thread david


On Wed, 30 May 2007, David Chinner wrote:


On Tue, May 29, 2007 at 04:03:43PM -0400, Phillip Susi wrote:

David Chinner wrote:

The use of barriers in XFS assumes the commit write to be on stable
storage before it returns.  One of the ordering guarantees that we
need is that the transaction (commit write) is on disk before the
metadata block containing the change in the transaction is written
to disk and the current barrier behaviour gives us that.


Barrier != synchronous write,


Of course. FYI, XFS only issues barriers on *async* writes.

But barrier semantics - as far as they've been described by everyone
but you indicate that the barrier write is guaranteed to be on stable
storage when it returns.


this doesn't match what I have seen

with barriers it's perfectly legal to have the following sequence of 
events


1. app writes block 10 to OS
2. app writes block 4 to OS
3. app writes barrier to OS
4. app writes block 5 to OS
5. app writes block 20 to OS
6. OS writes block 4 to disk drive
7. OS writes block 10 to disk drive
8. OS writes barrier to disk drive
9. OS writes block 5 to disk drive
10. OS writes block 20 to disk drive
11. disk drive writes block 10 to platter
12. disk drive writes block 4 to platter
13. disk drive writes block 20 to platter
14. disk drive writes block 5 to platter

there is nothing that says that when the app finishes step #3 that the OS 
has even sent the data to the drive, let alone that the drive has flushed 
it to a platter


if the disk drive doesn't support barriers then step #8 becomes 'issue 
flush' and steps 11 and 12 take place before step #9, 13, 14
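
A minimal sketch of the ordering rule in that sequence, assuming a single barrier and looking at one stage at a time (app to OS, OS to drive, or drive to platter). `respects_barrier()` is a hypothetical checker, not a kernel interface: it only says that no write submitted before the barrier may complete after a write submitted behind it. Reordering on either side of the barrier is allowed.

```c
#include <stdbool.h>

/* `before_barrier[i]` describes the i-th write to complete at some stage:
 * true if that block was submitted ahead of the barrier, false if it was
 * submitted after it. Returns true if the completion order is legal. */
static bool respects_barrier(const bool *before_barrier, int n)
{
    bool seen_after = false;

    for (int i = 0; i < n; i++) {
        if (!before_barrier[i])
            seen_after = true;
        else if (seen_after)
            return false;  /* pre-barrier write landed too late */
    }
    return true;
}
```

For the platter order in steps 11 to 14 (blocks 10, 4, 20, 5) the flags are true, true, false, false, which passes. An order such as 10, 5, 4, 20 would fail. Nothing in the check says when the writes reach stable storage, only in which relative order.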


David Lang

Re: [PATCH 6/8] Char: rtc, use wait_event_interruptible

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 15:29:50 +0200 (CEST)
Jiri Slaby <[EMAIL PROTECTED]> wrote:

> rtc, use wait_event_interruptible
> 
> Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>
> 
> ---
> commit 62bd843054a7b2b2a3e3003a6b57b6359f199837
> tree 7c6912d4ed42581ca46d384ab2467bfb9d9d9dbe
> parent 2b7813d0f34703e0a4b18593a3186bdc7e719c06
> author Jiri Slaby <[EMAIL PROTECTED]> Sat, 26 May 2007 21:58:54 +0200
> committer Jiri Slaby <[EMAIL PROTECTED]> Sat, 26 May 2007 21:58:54 +0200
> 
>  drivers/char/rtc.c |   40 +++-
>  1 files changed, 15 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/char/rtc.c b/drivers/char/rtc.c
> index 20380a2..787b520 100644
> --- a/drivers/char/rtc.c
> +++ b/drivers/char/rtc.c
> @@ -353,33 +353,26 @@ static ssize_t rtc_read(struct file *file, char __user *buf,
>   if (count != sizeof(unsigned int) && count !=  sizeof(unsigned long))
>   return -EINVAL;
>  
> - add_wait_queue(&rtc_wait, &wait);
> -
> - do {
> - /* First make it right. Then make it fast. Putting this whole
> -  * block within the parentheses of a while would be too
> -  * confusing. And no, xchg() is not the answer. */
> -
> - __set_current_state(TASK_INTERRUPTIBLE);
> - 
> - spin_lock_irq (&rtc_lock);
> - data = rtc_irq_data;
> - rtc_irq_data = 0;
> - spin_unlock_irq (&rtc_lock);
> -
> - if (data != 0)
> - break;
> + spin_lock_irq (&rtc_lock);
> + data = rtc_irq_data;
> + rtc_irq_data = 0;
> + spin_unlock_irq (&rtc_lock);
>  
> + if (data == 0) {
>   if (file->f_flags & O_NONBLOCK) {
>   retval = -EAGAIN;
>   goto out;
>   }
> - if (signal_pending(current)) {
> - retval = -ERESTARTSYS;
> + retval = wait_event_interruptible(rtc_wait, ({
> + spin_lock_irq (&rtc_lock);
> + data = rtc_irq_data;
> + rtc_irq_data = 0;
> + spin_unlock_irq (&rtc_lock);
> + data;
> + }));
> + if (retval)
>   goto out;
> - }
> - schedule();
> - } while (1);
> + }
>  
>   if (count == sizeof(unsigned int))
>   retval = put_user(data, (unsigned int __user *)buf) ?: sizeof(int);
> @@ -387,10 +380,7 @@ static ssize_t rtc_read(struct file *file, char __user *buf,
>   retval = put_user(data, (unsigned long __user *)buf) ?: sizeof(long);
>   if (!retval)
>   retval = count;
> - out:
> - __set_current_state(TASK_RUNNING);
> - remove_wait_queue(&rtc_wait, &wait);
> -
> +out:
>   return retval;
>  #endif

erm, I'm not sure that really improved the code.

Note that wait_event_interruptible() evaluates the condition twice, so
that's now three copies we have in there.  A little helper function would
make this all much better.
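
Something along these lines, perhaps. The helper name `rtc_fetch_irq_data()` is hypothetical, and the lock and its functions below are userspace stand-ins so that the fragment compiles on its own; in the driver they are the real `rtc_lock` spinlock and the kernel's `spin_lock_irq()` and `spin_unlock_irq()`.

```c
/* Userspace stand-ins, for illustration only. */
static unsigned long rtc_irq_data;
static int rtc_lock;
static void spin_lock_irq(int *lock) { (*lock)++; }
static void spin_unlock_irq(int *lock) { (*lock)--; }

/* Hypothetical helper: fetch and clear the pending IRQ data under the
 * lock. Returns 0 if nothing was pending. */
static unsigned long rtc_fetch_irq_data(void)
{
    unsigned long data;

    spin_lock_irq(&rtc_lock);
    data = rtc_irq_data;
    rtc_irq_data = 0;
    spin_unlock_irq(&rtc_lock);
    return data;
}

/* In rtc_read() the three open-coded copies would then collapse to calls
 * such as:
 *     data = rtc_fetch_irq_data();
 *     ...
 *     retval = wait_event_interruptible(rtc_wait,
 *                     (data = rtc_fetch_irq_data()) != 0);
 */
```

Because the fetch also clears `rtc_irq_data`, the result has to be captured in `data` at the point where the condition becomes true, as in the commented call.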

Also, feel free to sneak in a s/spin_lock_irq (/spin_lock_irq(/ cleanup
when making changes like this.


Re: [PATCH 1/3] allow console unregistration

2007-05-29 Thread Antonino A. Daplas

On Thu, 2007-05-17 at 15:32 -0700, Jesse Barnes wrote:
> Randy just informed me that the patch limits are bigger now, so here are the
> actual patches.
> 
> This patch allows for proper console unregistration via the VT layer, and
> updates the FB layer to use it.  This makes debugging new console drivers
> much easier, since you can properly clean them up before unloading.
> Antonio already checked it out (and suggested a tweak for the fbcon side)
> so I think it's on its way already via the FB tree.
> 

Jesse,

I already implemented (and tested) selective framebuffer/console
unregistration in my tree using your patch as a base. I'll send this
patch to akpm soon.

Geert,

Is this something that you might need for ps3fb? It should address 2 of
your concerns. The only thing is that you will need
CONFIG_VT_HW_CONSOLE_BINDING=y.  You can default this to 'y' in your
platform, or select it.

Tony


Re: [PATCH 2/7] cxgb3 - fix netpoll hanlder

2007-05-29 Thread Jeremy Fitzhardinge

Jeff Garzik wrote:
>>
>> +t3_intr_handler(adapter, qs->rspq.polling) (0,
>> +(adapter->flags & USING_MSIX) ?
>> +(void *)qs : (void *)adapter);
>
> Remove needless casts to void*
The two branches of ?: need to have the same type; without the casts
they'd be "struct sge_qset" and "struct adapter".  Seems a bit cruddy to
have two types passed to one function depending on the MSI state, but
maybe that's OK.
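
A cut-down illustration of the typing issue. The two structs are empty stand-ins for the driver's real `struct sge_qset` and `struct adapter`, the `USING_MSIX` value is made up, and `intr_cookie()` is a hypothetical name for the expression in the patch.

```c
/* Stand-ins for the driver's types; the real ones are much larger. */
struct sge_qset { int id; };
struct adapter { int flags; };

#define USING_MSIX 0x1  /* illustrative value, not the driver's */

/* Without the casts the two arms would have the incompatible types
 * `struct sge_qset *` and `struct adapter *`, which the conditional
 * operator does not accept. Casting to void * gives both arms, and
 * therefore the whole expression, a single type. */
static void *intr_cookie(struct adapter *adapter, struct sge_qset *qs)
{
    return (adapter->flags & USING_MSIX) ? (void *)qs : (void *)adapter;
}
```

Strictly speaking, C only needs one of the two casts: if one arm is a `void *` and the other is an object pointer, the result is `void *`. Keeping both is arguably clearer.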

J

Re: joydev.c and saitek cyborg evo force

2007-05-29 Thread Renato Golin


On 21/05/07, Jiri Kosina <[EMAIL PROTECTED]> wrote:

could you please turn on the HID debugging support ("Device Drivers -> HID
devices -> HID debugging support" in menuconfig of any reasonably recent
kernel) and show the output that appears when the joystick is plugged in,
and also when you generate the events that are messed up? This would
hopefully avoid any confusion regarding what is really going on and we'll
see what we can do with it.


Hi Jiri,

Couldn't make the generic kernel work on Ubuntu, it's quite a mess.
The distro kernel has USB general debugging instead of HID_DEBUG but
I had to manually redefine DEBUG and DEBUG_DATA on all hid*.c sources
and recompile the modules.

The HID sources are quite different from 2.6.21 and 2.6.20 but I don't
know how much was because of the Canonical guys and how much it really
changed. :( I will eventually put a Gentoo on my old laptop and try it
for real, sorry I couldn't be of much help now...

The only additional thing I got from debugging hid.ko and usbhid.ko
was right after detecting the mouse so I guess it didn't help at all.

--renato



May 30 00:40:06 jobim kernel: [ 7151.757499] usbcore: deregistering
interface driver usbhid
May 30 00:40:06 jobim kernel: [ 7151.769001] usbcore: deregistering
interface driver hiddev
May 30 00:40:06 jobim kernel: [ 7151.786229] usbcore: registered new
interface driver hiddev
May 30 00:40:06 jobim kernel: [ 7151.792457] input: Dell Dell USB
Mouse as /class/input/input56
May 30 00:40:06 jobim kernel: [ 7151.792816] input: USB HID v1.10
Mouse [Dell Dell USB Mouse] on usb-:00:0b.0-3
May 30 00:40:06 jobim kernel: [ 7151.807413] input: Saitek Cyborg Evo
Force as /class/input/input57
May 30 00:40:06 jobim kernel: [ 7151.807766] input: USB HID v1.00
Joystick [Saitek Cyborg Evo Force] on usb-:00:0b.0-7
May 30 00:40:06 jobim kernel: [ 7151.808122] usbcore: registered new
interface driver usbhid
May 30 00:40:06 jobim kernel: [ 7151.808328]
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
May 30 00:41:54 jobim kernel: [ 7260.252015] usbcore: deregistering
interface driver usbhid
May 30 00:41:54 jobim kernel: [ 7260.258180] usbcore: deregistering
interface driver hiddev
May 30 00:41:54 jobim kernel: [ 7260.276444] usbcore: registered new
interface driver hiddev
May 30 00:41:54 jobim kernel: [ 7260.293406] input: Dell Dell USB
Mouse as /class/input/input58
May 30 00:41:54 jobim kernel: [ 7260.293709] input: USB HID v1.10
Mouse [Dell Dell USB Mouse] on usb-:00:0b.0-3
May 30 00:41:54 jobim kernel: 09 02 26 2b 01 45 00 95 02 91 02 c0 05
0f 09 a7 27 fe ff 00 00 47 fe ff 00 00 95 01 55 fd 66 01 10 91 02 55
00 65 00 c0 09 5a a1 02 85 0c 09 23 26 2b 01 45 00 91 02 09 5c 26 10
27 46 10 27 55 fd 66 01 10 91 02 55 00 65 00 09 5b 25 7f 75 08 91 02
09 5e 26 10 27 75 10 55 fd 66 01 10 91 02 55 00 65 00 09 5d 25 7f 75
08 91 02 c0 09 73 a1 02 85 0d 09 23 26 2b 01 45 00 75 10 91 02 09 70
15 81 25 7f 36 f0 d8 46 10 27 75 08 91 02 c0 09 6e a1 02 85 0e 09 23
15 00 26 2b 01 35 00 45 00 75 10 91 02 09 70 25 7f 46 10 27 75 08 91
02 09 6f 15 81 36 f0 d8 91 02 09 71 15 00 26 ff 00 35 00 46 68 01 91
02 09 72 26 10 27 46 10 27 75 10 55 fd 66 01 10 91 02 55 00 65 00 c0
09 5f a1 02 85 0f 09 23 26 2b 01 45 00 91 02 09 61 15 9c 25 64 36 f0
d8 46 10 27 75 08 91 02 09 62 91 02 09 60 16 0c fe 26 f4 01 75 10 91
02 09 65 15 00 26 e8 03 35 00 91 02 09 63 25 64 75 08 91 02 09 64 91
02 c0 09 77 a1 02 85 51 09 22 25 09 45 00 91 02 09 78 a1 02 09 7b 09
79 09 7a 15 01 25 03 91 00 c0 09 7c 15 00 26 fe 00 91 02 c0
May 30 00:41:54 jobim kernel: 92 a1 02 85 52 09 96 a1 02 09 9a 09 99
09 97 09 98 09 9b 09 9c 15 01 25 06 91 00 c0 c0 05 ff 0a 01 03 a1 02
85 40 0a 02 03 a1 02 1a 11 03 2a 20 03 25 10 91 00 c0 0a 03 03 15 00
27 ff ff 00 00 75 10 91 02 c0 05 0f 09 7d a1 02 85 43 09 7e 26 80 00
46 10 27 75 08 91 02 c0 09 85 a1 02 85 44 09 86 27 ff ff 00 00 45 00
75 10 91 02 09 87 91 02 09 88 91 02 c0 05 ff 0a 00 01 a1 02 85 81 05
01 09 30 15 81 25 7f 36 f0 d8 46 10 27 75 08 91 02 09 31 91 02 c0 05
0f 09 7f a1 02 85 0b 09 80 15 00 26 ff 7f 35 00 45 00 75 0f b1 03 09
a9 25 01 75 01 b1 03 09 83 26 ff 00 75 08 b1 03 09 84 25 10 b1 03 09
a8 a1 02 09 73 09 6e 09 5a 09 5f 95 04 b1 03 c0 c0 c0
May 30 00:41:54 jobim kernel: [ 7260.320561] input: Saitek Cyborg Evo
Force as /class/input/input59
May 30 00:41:54 jobim kernel: [ 7260.320633] input: USB HID v1.00
Joystick [Saitek Cyborg Evo Force] on usb-:00:0b.0-7
May 30 00:41:54 jobim kernel: [ 7260.320654] usbcore: registered new
interface driver usbhid
May 30 00:41:54 jobim kernel: [ 7260.320660]
drivers/usb/input/hid-core.c: v2.6:USB HID core driver

Re: [patch] CFS scheduler, -v12

2007-05-29 Thread Peter Williams


Siddha, Suresh B wrote:

On Thu, May 24, 2007 at 04:23:19PM -0700, Peter Williams wrote:

Siddha, Suresh B wrote:

On Thu, May 24, 2007 at 12:43:58AM -0700, Peter Williams wrote:

Further testing indicates that CONFIG_SCHED_MC is not implicated and
it's CONFIG_SCHED_SMT that's causing the problem.  This rules out the
code in find_busiest_group() as it is common to both macros.

I think this makes the scheduling domain parameter values the most
likely cause of the problem.  I'm not very familiar with this code so
I've added those who've modified this code in the last year or so to
the address of this e-mail.

What platform is this? I remember you mentioned it's a 2 cpu box. Is it
dual core or dual package or one with HT?

It's a single CPU HT box i.e. 2 virtual CPUs.  "cat /proc/cpuinfo"
produces:


Peter, I tried on a similar box and couldn't reproduce this problem
with x86_64


Mine's a 32 bit machine.


2.6.22-rc3 kernel


I haven't tried rc3 yet.


and using defconfig(has SCHED_SMT turned on).
I am using top and just the spinners.  I don't have gkrellm running, is that
required to reproduce the issue?


Not necessarily.  But you may need to do a number of trials as sheer 
chance plays a part.




I tried a number of times, in both runlevels 3 and 5 (with top running
in an xterm in the case of runlevel 5).


I've always done it in run level 5 using gnome-terminal.  I use 10 
consecutive trials without seeing the problem as an indication of its 
absence but will cut that short if I see a 3/1 which quickly recovers 
(see below).




In runlevel 5, occasionally for one refresh of the top screen, I see three
spinners on one cpu and one spinner on the other (with X or some other app
also on the cpu with one spinner). But it balances nicely by the
immediately following refresh of the top screen.


Yes, that (the fact that it recovers quickly) confirms that the problem 
isn't present for your system.  If load balancing occurs when other 
tasks than the spinners are actually running a 1/3 split for the 
spinners is a reasonable outcome so seeing the occasional 1/3 split is 
OK but it should return to 2/2 as soon as the other tasks sleep.


When I'm doing my tests (for the various combinations of macros) I 
always count a case where I see a 3/1 split that quickly recovers as 
proof that this problem isn't present for that case and cease testing.




I tried with various refresh rates of top too.. Do you see the issue
at runlevel 3 too?


I haven't tried that.

Do your spinners ever relinquish the CPU voluntarily?

Peter
--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

Re: [PATCH 00/20] Blackfin update for 2.6.22-rc3

2007-05-29 Thread Bernd Schmidt

Linus Torvalds wrote:
> 
> On Mon, 28 May 2007, Bryan Wu wrote:
>>  - Blackfin arch update including BF54x initial supporting
>>  - Blackfin driver update: serial/spi/rtc
>>  - Provide new Blackfin watchdog driver
>>  - binfmt_flat.c for Blackfin arch modification
> 
> I realize that this all just touches blackfin-specific stuff, but after 
> -rc3 I really prefer not to bother with these things..
> 
> Also, for stuff that is really just an architecture that I can't even 
> test, and where there is a clear maintainership thing, I'd actually prefer 
> to just do a git merge, if possible. It's not like I will likely start 
> looking at some blackfin-specific patches. Judging from the diffs, you do 
> actually use git, do you have a place where you could export these kinds 
> of patch-series as a git tree instead?

The binfmt_flat patch also touches other nommu architectures.  Do you
want these kinds of patches (which aren't just Blackfin-specific)
separately as they come up?


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH  Wilhelm-Wagenfeld-Str. 6  80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

Re: Syslets, Threadlets, generic AIO support, v6

2007-05-29 Thread Ulrich Drepper

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Zach Brown wrote:
> That todo item
> about producing documentation and distro kernels is specifically to bait
> Uli into trying to implement posix aio on top of syslets in glibc.

Get DaveJ to pick up the code for Fedora kernels and I'll get to it.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGXLUk2ijCOnn/RHQRAjL0AJ0UQzNnMn8xpj7ga0OeEWUhnkhZfgCfTH+j
iQ52SLZgWwp4wmAGCy/eLZs=
=hpyn
-END PGP SIGNATURE-

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-29 Thread David Chinner

On Tue, May 29, 2007 at 04:03:43PM -0400, Phillip Susi wrote:
> David Chinner wrote:
> >The use of barriers in XFS assumes the commit write to be on stable
> >storage before it returns.  One of the ordering guarantees that we
> >need is that the transaction (commit write) is on disk before the
> >metadata block containing the change in the transaction is written
> >to disk and the current barrier behaviour gives us that.
> 
> Barrier != synchronous write,

Of course. FYI, XFS only issues barriers on *async* writes.

But barrier semantics - as far as they've been described by everyone
but you - indicate that the barrier write is guaranteed to be on stable
storage when it returns.

> so if XFS relies on that block being on 
> the media when the request is completed, then it is broken.

XFS relies on the block being stable before any other write
goes to disk. That is the semantic that the barrier I/Os currently
have. How that is implemented in the device is irrelevant to me,
but if I issue a barrier I/O, I do not expect *any* I/O to be
reordered around it.

> It should 
> only care that the ordering of log-data-log is maintained, not exactly 
> when each specific request completes.

Yes, and that is provided to XFS by the fact that barrier I/Os are
full barriers.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Re: [PATCH] Documentation: How to use GDB to decode OOPSes

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 10:46:18 +0300 (EEST)
Pekka J Enberg <[EMAIL PROTECTED]> wrote:

> +In addition, you can use GDB to figure out the exact file and line
> +number of the OOPS from the vmlinux file. If you have
> +CONFIG_DEBUG_INFO enabled, you can simply copy the EIP value from the
> +OOPS:
> +
> + EIP:    0060:[<c021e50e>]    Not tainted VLI
> +
> +And use GDB to translate that to human-readable form:
> +
> +  gdb vmlinux
> +  (gdb) l *0xc021e50e
> +
> +If you don't have CONFIG_DEBUG_INFO enabled, you use the function
> +offset from the OOPS:
> +
> + EIP is at vt_ioctl+0xda8/0x1482
> +
> +And recompile the kernel with CONFIG_DEBUG_INFO enabled:
> +
> +  make vmlinux
> +  gdb vmlinux
> +  (gdb) p vt_ioctl
> +  (gdb) l *(0x<address of vt_ioctl> + 0xda8)
> +

yeah.  Often this process will tell you that the oops was in
spin_lock_irq() or list_add() or something useless like that.

So the next step is to start adding and subtracting 4, 8, 12, ...
to the EIP value until you "fall out" of the inlined function and
back into the caller.

But I'm not sure that I'd want to have to describe that process
(especially the means by which one determines that it is necessary)
to normal people ;)
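
The address arithmetic discussed above can be sketched in C. This is only an
illustration under stated assumptions: parse_oops_symbol() and
neighbour_addresses() are hypothetical helper names, not part of GDB or the
kernel, and the stepping order (EIP, then -4, +4, -8, +8, ...) is just one
reading of the "adding and subtracting 4, 8, 12" suggestion.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split an oops symbol such as "vt_ioctl+0xda8/0x1482"
 * into the function name, the offset into it and the function size.
 * Returns 0 on success, -1 if the text does not have that shape. */
int parse_oops_symbol(const char *text, char *name, size_t name_len,
                      unsigned long *offset, unsigned long *size)
{
    char buf[64];

    /* %lx accepts an optional leading "0x", like strtoul() with base 16. */
    if (sscanf(text, "%63[^+]+%lx/%lx", buf, offset, size) != 3)
        return -1;
    if (strlen(buf) >= name_len)
        return -1;
    strcpy(name, buf);
    return 0;
}

/* Hypothetical helper: fill out[] with eip, eip-4, eip+4, eip-8, eip+8, ...
 * These are the candidate addresses one would hand to "l *0x..." in turn. */
void neighbour_addresses(unsigned long eip, unsigned long *out, int count)
{
    for (int i = 0; i < count; i++) {
        unsigned long step = 4UL * (unsigned long)((i + 1) / 2);

        out[i] = (i % 2) ? eip - step : eip + step;
    }
}
```

The address to list is then the function's base address (as printed by
"p vt_ioctl" in the quoted text) plus the parsed offset; the candidate list
only automates the manual trial and error, it does not decide which candidate
is the real call site.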

[alpha] support new syscalls

2007-05-29 Thread Richard Henderson

Some of the new syscalls require supporting TIF_RESTORE_SIGMASK.


r~


diff -ruNp /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/kernel/entry.S linux-2.6.22-rc2/arch/alpha/kernel/entry.S
--- /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/kernel/entry.S 2007-04-25 20:08:32.0 -0700
+++ linux-2.6.22-rc2/arch/alpha/kernel/entry.S  2007-05-25 17:41:50.0 -0700
@@ -391,11 +391,10 @@ $work_resched:
bne $2, $work_resched
 
 $work_notifysig:
-   mov $sp, $17
+   mov $sp, $16
br  $1, do_switch_stack
-   mov $5, $21
-   mov $sp, $18
-   mov $31, $16
+   mov $sp, $17
+   mov $5, $18
jsr $26, do_notify_resume
bsr $1, undo_switch_stack
br  restore_all
diff -ruNp /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/kernel/signal.c linux-2.6.22-rc2/arch/alpha/kernel/signal.c
--- /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/kernel/signal.c 2007-05-28 15:23:20.0 -0700
+++ linux-2.6.22-rc2/arch/alpha/kernel/signal.c 2007-05-25 17:59:31.0 -0700
@@ -32,8 +32,8 @@
 #define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
 
 asmlinkage void ret_from_sys_call(void);
-static int do_signal(sigset_t *, struct pt_regs *, struct switch_stack *,
-unsigned long, unsigned long);
+static void do_signal(struct pt_regs *, struct switch_stack *,
+ unsigned long, unsigned long);
 
 
 /*
@@ -146,11 +146,9 @@ sys_rt_sigaction(int sig, const struct s
 asmlinkage int
 do_sigsuspend(old_sigset_t mask, struct pt_regs *regs, struct switch_stack *sw)
 {
-   sigset_t oldset;
-
mask &= _BLOCKABLE;
spin_lock_irq(&current->sighand->siglock);
-   oldset = current->blocked;
+   current->saved_sigmask = current->blocked;
siginitset(&current->blocked, mask);
recalc_sigpending();
spin_unlock_irq(&current->sighand->siglock);
@@ -160,19 +158,17 @@ do_sigsuspend(old_sigset_t mask, struct 
regs->r0 = EINTR;
regs->r19 = 1;
 
-   while (1) {
-   current->state = TASK_INTERRUPTIBLE;
-   schedule();
-   if (do_signal(&oldset, regs, sw, 0, 0))
-   return -EINTR;
-   }
+   current->state = TASK_INTERRUPTIBLE;
+   schedule();
+   set_thread_flag(TIF_RESTORE_SIGMASK);
+   return -ERESTARTNOHAND;
 }
 
 asmlinkage int
 do_rt_sigsuspend(sigset_t __user *uset, size_t sigsetsize,
 struct pt_regs *regs, struct switch_stack *sw)
 {
-   sigset_t oldset, set;
+   sigset_t set;
 
/* XXX: Don't preclude handling different sized sigset_t's.  */
if (sigsetsize != sizeof(sigset_t))
@@ -182,7 +178,7 @@ do_rt_sigsuspend(sigset_t __user *uset, 
 
sigdelsetmask(&set, ~_BLOCKABLE);
spin_lock_irq(&current->sighand->siglock);
-   oldset = current->blocked;
+   current->saved_sigmask = current->blocked;
current->blocked = set;
recalc_sigpending();
spin_unlock_irq(&current->sighand->siglock);
@@ -192,12 +188,10 @@ do_rt_sigsuspend(sigset_t __user *uset, 
regs->r0 = EINTR;
regs->r19 = 1;
 
-   while (1) {
-   current->state = TASK_INTERRUPTIBLE;
-   schedule();
-   if (do_signal(&oldset, regs, sw, 0, 0))
-   return -EINTR;
-   }
+   current->state = TASK_INTERRUPTIBLE;
+   schedule();
+   set_thread_flag(TIF_RESTORE_SIGMASK);
+   return -ERESTARTNOHAND;
 }
 
 asmlinkage int
@@ -436,7 +430,7 @@ setup_sigcontext(struct sigcontext __use
return err;
 }
 
-static void
+static int
 setup_frame(int sig, struct k_sigaction *ka, sigset_t *set,
struct pt_regs *regs, struct switch_stack * sw)
 {
@@ -481,13 +475,14 @@ setup_frame(int sig, struct k_sigaction 
current->comm, current->pid, frame, regs->pc, regs->r26);
 #endif
 
-   return;
+   return 0;
 
 give_sigsegv:
force_sigsegv(sig, current);
+   return -EFAULT;
 }
 
-static void
+static int
 setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
   sigset_t *set, struct pt_regs *regs, struct switch_stack * sw)
 {
@@ -543,34 +538,38 @@ setup_rt_frame(int sig, struct k_sigacti
current->comm, current->pid, frame, regs->pc, regs->r26);
 #endif
 
-   return;
+   return 0;
 
 give_sigsegv:
force_sigsegv(sig, current);
+   return -EFAULT;
 }
 
 
 /*
  * OK, we're invoking a handler.
  */
-static inline void
+static inline int
 handle_signal(int sig, struct k_sigaction *ka, siginfo_t *info,
  sigset_t *oldset, struct pt_regs * regs, struct switch_stack *sw)
 {
+   int ret;
+
if (ka->sa.sa_flags & SA_SIGINFO)
-   setup_rt_frame(sig, ka, info, oldset, regs, sw);
+   ret = setup_rt_frame(sig, ka, info, oldset, regs, sw);
else
-   setup_frame(sig, ka, oldset, regs, sw);
+   ret = setup_frame(sig, ka,
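
The behaviour this patch has to preserve is visible from user space: the
mask passed to sigsuspend() is only in effect while the caller waits, and the
previous mask is back in place once the handler has run. The following is a
minimal user-space sketch of that contract, not kernel code; the function
name sigsuspend_restores_mask() is made up for illustration, and it assumes a
single-threaded POSIX process.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t handled;

static void on_usr1(int sig)
{
    (void)sig;
    handled = 1;
}

/* Returns 1 if sigsuspend() ran the handler, failed with EINTR, and left
 * SIGUSR1 blocked again afterwards; returns 0 otherwise. */
int sigsuspend_restores_mask(void)
{
    struct sigaction sa;
    sigset_t block, empty, now;
    int rc, err;

    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return 0;

    /* Block SIGUSR1, then make it pending. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, NULL) != 0)
        return 0;
    handled = 0;
    raise(SIGUSR1);

    /* Temporarily unblock everything; the pending signal is delivered. */
    sigemptyset(&empty);
    rc = sigsuspend(&empty);
    err = errno;

    /* Query the current mask: the pre-sigsuspend() mask must be back. */
    if (sigprocmask(SIG_SETMASK, NULL, &now) != 0)
        return 0;
    return handled && rc == -1 && err == EINTR &&
           sigismember(&now, SIGUSR1) == 1;
}
```

In the patch, the same effect is reached by storing the old mask in
current->saved_sigmask and setting TIF_RESTORE_SIGMASK, so the restore happens
on the signal-delivery path instead of inside a loop in the syscall.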

[alpha] cleanup in bitops.h

2007-05-29 Thread Richard Henderson

Remove 2 functions private to the alpha implementation,
in favor of similar functions in <linux/log2.h>.

Provide a more efficient version of the fls64 function
for pre-ev67 alphas.


r~



diff -ruNp /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/kernel/pci_iommu.c linux-2.6.22-rc2/arch/alpha/kernel/pci_iommu.c
--- /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/kernel/pci_iommu.c 2007-04-25 20:08:32.0 -0700
+++ linux-2.6.22-rc2/arch/alpha/kernel/pci_iommu.c  2007-05-25 15:42:32.0 -0700
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include <linux/log2.h>
 
 #include 
 #include 
@@ -53,7 +54,7 @@ size_for_memory(unsigned long max)
 {
unsigned long mem = max_low_pfn << PAGE_SHIFT;
if (mem < max)
-   max = 1UL << ceil_log2(mem);
+   max = roundup_pow_of_two(mem);
return max;
 }
 
diff -ruNp /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/kernel/setup.c linux-2.6.22-rc2/arch/alpha/kernel/setup.c
--- /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/kernel/setup.c 2007-05-28 15:23:20.0 -0700
+++ linux-2.6.22-rc2/arch/alpha/kernel/setup.c  2007-05-25 15:41:20.0 -0700
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include <linux/log2.h>
 
 extern struct atomic_notifier_head panic_notifier_list;
 static int alpha_panic_event(struct notifier_block *, unsigned long, void *);
@@ -1303,7 +1304,7 @@ external_cache_probe(int minsize, int wi
long size = minsize, maxsize = MAX_BCACHE_SIZE * 2;
 
if (maxsize > (max_low_pfn + 1) << PAGE_SHIFT)
-   maxsize = 1 << (floor_log2(max_low_pfn + 1) + PAGE_SHIFT);
+   maxsize = 1 << (ilog2(max_low_pfn + 1) + PAGE_SHIFT);
 
/* Get the first block cached. */
read_mem_block(__va(0), stride, size);
diff -ruNp /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/lib/Makefile linux-2.6.22-rc2/arch/alpha/lib/Makefile
--- /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/lib/Makefile   2007-05-28 15:23:20.0 -0700
+++ linux-2.6.22-rc2/arch/alpha/lib/Makefile    2007-05-25 15:43:08.0 -0700
@@ -37,7 +37,8 @@ lib-y =   __divqu.o __remqu.o __divlu.o __
$(ev6-y)clear_page.o \
$(ev6-y)copy_page.o \
fpreg.o \
-   callback_srm.o srm_puts.o srm_printk.o
+   callback_srm.o srm_puts.o srm_printk.o \
+   fls.o
 
 lib-$(CONFIG_SMP) += dec_and_lock.o
 
diff -ruNp /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/lib/fls.c linux-2.6.22-rc2/arch/alpha/lib/fls.c
--- /home/rth/work/linux/linux-2.6.22-rc2/arch/alpha/lib/fls.c  1969-12-31 16:00:00.0 -0800
+++ linux-2.6.22-rc2/arch/alpha/lib/fls.c   2007-05-28 16:06:46.0 -0700
@@ -0,0 +1,38 @@
+/* 
+ * arch/alpha/lib/fls.c
+ */
+
+#include 
+#include 
+
+/* This is fls(x)-1, except zero is held to zero.  This allows most
+   efficient input into extbl, plus it allows easy handling of fls(0)=0.  */
+
+const unsigned char __flsm1_tab[256] = 
+{
+  0,
+  0,
+  1, 1,
+  2, 2, 2, 2,
+  3, 3, 3, 3, 3, 3, 3, 3,
+  4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
+
+  5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
+  5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
+
+  6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+  6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+  6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+  6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+
+  7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
+  7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
+  7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
+  7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
+  7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
+  7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
+  7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
+  7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
+};
+
+EXPORT_SYMBOL(__flsm1_tab);
diff -ruNp /home/rth/work/linux/linux-2.6.22-rc2/include/asm-alpha/bitops.h linux-2.6.22-rc2/include/asm-alpha/bitops.h
--- /home/rth/work/linux/linux-2.6.22-rc2/include/asm-alpha/bitops.h 2007-04-25 20:08:32.0 -0700
+++ linux-2.6.22-rc2/include/asm-alpha/bitops.h 2007-05-28 16:05:18.0 -0700
@@ -313,32 +313,29 @@ static inline int ffs(int word)
  * fls: find last bit set.
  */
 #if defined(CONFIG_ALPHA_EV6) && defined(CONFIG_ALPHA_EV67)
-static inline int fls(int word)
+static inline int fls64(unsigned long word)
 {
-   return 64 - __kernel_ctlz(word & 0xffffffff);
+   return 64 - __kernel_ctlz(word);
 }
 #else
-#include 
-#endif
-#include 
+extern const unsigned char __flsm1_tab[256];
 
-/* Compute powers of two for the given integer.  */
-static inline long floor_log2(unsigned long word)
+static inline int fls64(unsigned long x)
 {
-#if defined(CONFIG_ALPHA_EV6) && defined(CONFIG_ALPHA_EV67)
-   return 63 - __kernel_ctlz(word);
-#else
-   long bit;
-   for (bit = -1; word ; bit++)
-   word >>= 1;
-   return bit;
-#endif
+   unsigned long t, a, r;
+
+   t = __kernel_cmpbge (x, 0x0101010101010101);
+
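
The table above and the helpers this patch removes can be checked with a
small portable sketch. It is not the alpha implementation (which is cut off
above and relies on cmpbge/extbl); it only reproduces the same results in
plain C. The names floor_log2_loop(), build_flsm1_tab(), fls64_table() and
roundup_pow2() are hypothetical stand-ins chosen for this example.

```c
#include <stdint.h>

/* floor(log2(word)) by shifting, as in the removed floor_log2() fallback.
 * Returns -1 for word == 0. */
long floor_log2_loop(uint64_t word)
{
    long bit;

    for (bit = -1; word; bit++)
        word >>= 1;
    return bit;
}

/* Rebuild the 256-entry table: entry i is fls(i) - 1, with entry 0 held
 * at 0, matching the description in the fls.c comment. */
void build_flsm1_tab(unsigned char tab[256])
{
    tab[0] = 0;
    for (int i = 1; i < 256; i++)
        tab[i] = (unsigned char)floor_log2_loop((uint64_t)i);
}

/* Portable fls64: find the highest non-zero byte, then finish with one
 * table lookup. Returns 0 for x == 0, else the 1-based index of the top bit. */
int fls64_table(uint64_t x, const unsigned char tab[256])
{
    int byte = 7;

    if (x == 0)
        return 0;
    while (((x >> (8 * byte)) & 0xff) == 0)
        byte--;
    return 8 * byte + tab[(x >> (8 * byte)) & 0xff] + 1;
}

/* Smallest power of two >= n, valid for 1 <= n <= 2^63. This mirrors what
 * "1UL << ceil_log2(mem)" computed before the patch. */
uint64_t roundup_pow2(uint64_t n)
{
    return n <= 1 ? 1 : (uint64_t)1 << (floor_log2_loop(n - 1) + 1);
}
```

Building the table at run time is only for checking; the patch ships it as a
constant array so that the lookup costs a single load.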

Re: [PATCH] Kconfig powernow-k8 driver should depend on ACPI P-States driver

2007-05-29 Thread Daniel Drake


Dave Jones wrote:

The patch content looks ok to me, Daniel, ack?


Works for me, thanks.

Daniel


Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

I just noticed a bug in my testbed/benchmarking code. It's fixed, but I 
decided to compare version 6 of the code against the *unsafe* decompressor 
again. The results of the three runs I've put it through after changing it to 
compare against the unsafe decompressor were startling. `Tiny's` compressor 
is still faster - I've seen it be rated up to 3% faster. The decompressor, 
OTOH, when compared to the unsafe version (which is the comparison that 
started me on this binge of hacking), is more than 7% worse. About 11% slower 
on the original test against a C source file, and about 6% slower for random 
data. However, looking at the numbers involved, I can't see a reason to keep 
the unsafe version around - the percentages look worse than they are - from 1 
to 3 microseconds. (well, the compressed-cache people might want those extra 
usecs - but the difference will never be noticeable anywhere outside the 
kernel)

DRH


lzo1x-test-6a.tar.bz2
Description: application/tbz

Re: [AppArmor 01/41] Pass struct vfsmount to the inode_create LSM hook

2007-05-29 Thread Pavel Machek

Hi!

> >>>If we want "/etc/shadow" to be the only way to access the shadow file
> >>>we could label the data with "/etc/shadow". Any attempts to access
> >>>this data using a renamed file or link would be denied (attempts to
> >>>link or rename could also be denied).
> >>Eloquently put.
> >>
> >>AppArmor actually does something similar to this, by mediating all of
> >>the ways that you can make an alias to a file. These are:
> >...
> >>* Hard links: AppArmor explicitly mediates permission to make a hard
> >
> >Unfortunately, AppArmor is by design limited to a subset of the distro
> >(network daemons). Unfortunately, some other programs (passwd, vi)
> >routinely make hardlinks. So AA mediating hardlink is not enough, as
> >vi will happily hardlink /etc/shadow into /etc/.vi-shadow-1234.
> 
> but with the AA design of default deny this isn't a big problem unless you 
> specifically allow some network daemon to access /etc/.vi-shadow-1234

...or unless vi decides to hardlink into /tmp or something.

> no, this won't help you much against local users, but there are a _lot_ of 
> boxes out there with few, if any, local users who don't also have the root 
> password. AA helps the admin be safer when configuring network daemons.

Hmm, I guess I'd love "it is useless on multiuser boxes" to become
a standard part of AA advertising.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [AppArmor 01/41] Pass struct vfsmount to the inode_create LSM hook

2007-05-29 Thread david


On Tue, 29 May 2007, Pavel Machek wrote:


Hi!


If we want "/etc/shadow" to be the only way to access the shadow file
we could label the data with "/etc/shadow". Any attempts to access
this data using a renamed file or link would be denied (attempts to
link or rename could also be denied).

Eloquently put.

AppArmor actually does something similar to this, by mediating all of
the ways that you can make an alias to a file. These are:

...

* Hard links: AppArmor explicitly mediates permission to make a hard


Unfortunately, AppArmor is by design limited to a subset of the distro
(network daemons). Unfortunately, some other programs (passwd, vi)
routinely make hardlinks. So AA mediating hardlink is not enough, as
vi will happily hardlink /etc/shadow into /etc/.vi-shadow-1234.


but with the AA design of default deny this isn't a big problem unless you 
specifically allow some network daemon to access /etc/.vi-shadow-1234


in this context passwd and vi are considered trusted processes, they are 
used _after_ you authenticate onto the box.


no, this won't help you much against local users, but there are a _lot_ of 
boxes out there with few, if any, local users who don't also have the root 
password. AA helps the admin be safer when configuring network daemons.


David Lang

[GIT PULL] please pull infiniband.git

2007-05-29 Thread Roland Dreier

Linus, please pull from

master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This will get a few more nicely balanced ("55 insertions(+), 55
deletions(-)") 2.6.22-rc3 fixes, mostly for IPoIB connected mode:

Michael S. Tsirkin (2):
  IB/mthca: Fix handling of send CQE with error for QPs connected to SRQ
  IPoIB/cm: Fix performance regression on Mellanox

Roland Dreier (1):
  IB/mlx4: Fix last allocated object tracking in bitmap allocator

Sean Hefty (1):
  IB/cm: Fix stale connection detection

 drivers/infiniband/core/cm.c            |   25 ++-
 drivers/infiniband/hw/mthca/mthca_qp.c  |    6 +-
 drivers/infiniband/ulp/ipoib/ipoib.h    |    3 +-
 drivers/infiniband/ulp/ipoib/ipoib_cm.c |   74 +++
 drivers/net/mlx4/alloc.c                |    2 +-
 5 files changed, 55 insertions(+), 55 deletions(-)


diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e840434..40c004a 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1297,26 +1297,29 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
 
req_msg = (struct cm_req_msg *)work->mad_recv_wc->recv_buf.mad;
 
-   /* Check for duplicate REQ and stale connections. */
+   /* Check for possible duplicate REQ. */
spin_lock_irqsave(&cm.lock, flags);
timewait_info = cm_insert_remote_id(cm_id_priv->timewait_info);
-   if (!timewait_info)
-   timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
-
if (timewait_info) {
cur_cm_id_priv = cm_get_id(timewait_info->work.local_id,
   timewait_info->work.remote_id);
-   cm_cleanup_timewait(cm_id_priv->timewait_info);
spin_unlock_irqrestore(&cm.lock, flags);
if (cur_cm_id_priv) {
cm_dup_req_handler(work, cur_cm_id_priv);
cm_deref_id(cur_cm_id_priv);
-   } else
-   cm_issue_rej(work->port, work->mad_recv_wc,
-IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
-NULL, 0);
-   listen_cm_id_priv = NULL;
-   goto out;
+   }
+   return NULL;
+   }
+
+   /* Check for stale connections. */
+   timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
+   if (timewait_info) {
+   cm_cleanup_timewait(cm_id_priv->timewait_info);
+   spin_unlock_irqrestore(&cm.lock, flags);
+   cm_issue_rej(work->port, work->mad_recv_wc,
+IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
+NULL, 0);
+   return NULL;
}
 
/* Find matching listen request. */
diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c 
b/drivers/infiniband/hw/mthca/mthca_qp.c
index 0276649..eef415b 100644
--- a/drivers/infiniband/hw/mthca/mthca_qp.c
+++ b/drivers/infiniband/hw/mthca/mthca_qp.c
@@ -2284,10 +2284,10 @@ void mthca_free_err_wqe(struct mthca_dev *dev, struct mthca_qp *qp, int is_send,
struct mthca_next_seg *next;
 
/*
-* For SRQs, all WQEs generate a CQE, so we're always at the
-* end of the doorbell chain.
+* For SRQs, all receive WQEs generate a CQE, so we're always
+* at the end of the doorbell chain.
 */
-   if (qp->ibqp.srq) {
+   if (qp->ibqp.srq && !is_send) {
*new_wqe = 0;
return;
}
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h 
b/drivers/infiniband/ulp/ipoib/ipoib.h
index 158759e..285c143 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -156,7 +156,7 @@ struct ipoib_cm_data {
  * - and then invoke a Destroy QP or Reset QP.
  *
  * We use the second option and wait for a completion on the
- * rx_drain_qp before destroying QPs attached to our SRQ.
+ * same CQ before destroying QPs attached to our SRQ.
  */
 
 enum ipoib_cm_state {
@@ -199,7 +199,6 @@ struct ipoib_cm_dev_priv {
struct ib_srq  *srq;
struct ipoib_cm_rx_buf *srq_ring;
struct ib_cm_id*id;
-   struct ib_qp   *rx_drain_qp;   /* generates WR described in 10.3.1 */
struct list_head passive_ids;   /* state: LIVE */
struct list_head rx_error_list; /* state: ERROR */
struct list_head rx_flush_list; /* state: FLUSH, drain not started */
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 
b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index f133b56..076a0bb 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -69,8 +69,9 @@ static struct ib_qp_attr ipoib_cm_err_attr = {

Re: [patch -mm 1/1] remove useless tolower in isofs

2007-05-29 Thread Andrew Morton

On Mon, 28 May 2007 03:11:04 +
"young dave" <[EMAIL PROTECTED]> wrote:

> Hi,
> > And then there's the supercompact form.
> >
> > while (len--) {
> > hash = partial_name_hash(tolower(*name++), hash);
> > }
> >
> > But I do not like the last one at all. The first one is the best, because
> > it clearly separates the condition and iteration parts of the expression,
> > while STILL being only three lines long. Or two, if you omit the braces.
> > (But you shouldn't.)
> >
> 
> IMO, I like the last one, but I prefer to keep the original author's
> one, I only remove the unnecessary tolower function.
> What do you think about this , Andrew?
> 

Don't care much.  The code as it stands is suitably paranoid about 
buggy implementations of tolower() which evaluate their arg more
than once ;)

Your email client replaces tabs with spaces.
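
The concern about a tolower() that evaluates its argument more than once can
be shown with a small sketch. BAD_TOLOWER below is a deliberately buggy,
hypothetical macro (it is not taken from any real libc or from the kernel),
and toy_hash_step() is only a stand-in with the same shape as
partial_name_hash(). A conforming libc tolower(), even when provided as a
macro, must evaluate its argument exactly once.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical buggy macro: its argument appears three times, so a
 * side effect such as "*name++" would run more than once per call. */
#define BAD_TOLOWER(c) (((c) >= 'A' && (c) <= 'Z') ? (c) + ('a' - 'A') : (c))

/* Count how often the argument expression is evaluated. */
int count_evals_bad(int ch)
{
    int evals = 0;

    (void)BAD_TOLOWER((evals++, ch));
    return evals;
}

int count_evals_libc(int ch)
{
    int evals = 0;

    (void)tolower((evals++, ch));
    return evals;
}

/* Stand-in hash step, shaped like partial_name_hash(). */
unsigned long toy_hash_step(unsigned long c, unsigned long prevhash)
{
    return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/* The "paranoid" form: read the byte into a local first, so that any
 * tolower() implementation sees a side-effect-free argument. */
unsigned long hash_lower(const char *name, size_t len, unsigned long hash)
{
    while (len--) {
        int c = (unsigned char)*name++;

        hash = toy_hash_step((unsigned long)tolower(c), hash);
    }
    return hash;
}
```

With the buggy macro, an uppercase argument is evaluated three times, which
in the supercompact loop would skip characters and could read past the end of
the name; reading into a local first avoids that regardless of how tolower()
is implemented.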

Re: Syslets, Threadlets, generic AIO support, v6

2007-05-29 Thread Zach Brown

> You should pick up the kevent work :)

I haven't looked at it in a while but yes, it's "on the radar" :).

> Having async request and response rings would be quite useful, and most 
> closely match what is going on under the hood in the kernel and hardware.

Yeah, but I have lots of competing thoughts about this.

For the time being I'm focusing on simplifying the mechanisms that
support the sys_io_*() interface so I never ever have to debug fs/aio.c
(also known as chewing glass to those of us with the scars) again.

That said, I'll gladly work closely with developers who are seriously
considering putting some next gen interface to the test.  That todo item
about producing documentation and distro kernels is specifically to bait
Uli into trying to implement posix aio on top of syslets in glibc.

'cause we can go back and forth about potential interfaces for, well,
how long as it been?  years?  I want non-trivial users who we can
measure so we can *stop* designing and implementing the moment something
is good enough for them.

- z

Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards.

2007-05-29 Thread Justin Piszcz


Short Description of Problem:
Linux 2.6.21.3 does not run properly with 8GB of ram on the Intel 965WH 
motherboard.


Long Description of Problem:
When I use 8GB of memory on my x86_64 system, CPU-bound processes are VERY
slow, up to 36x slower than usual.  My temporary fix is to force Linux to use
only 4GB of memory; I am currently booting with mem=4096M.  I ran memtest86
and the memory is fine, not a single error.  I tried mem= values of 1024M,
2048M, and 4096M, as well as leaving it blank ("") to let the kernel use all
8GB of memory.  What is wrong with the kernel, and why can it not use 8GB of
memory without slowing all CPU-bound processes to a snail-like pace?  There
is something horribly wrong here.

Specifications:
Intel Motherboard: 965WH
Linux Kernel: 2.6.21.3
Distribution: Debian Testing x86_64
GCC: gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
Target: x86_64-linux-gnu

Tests:

1. append line = 1024M
top - 18:28:26 up 1 min,  4 users,  load average: 0.42, 0.17, 0.06
Tasks: 157 total,   1 running, 156 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1027016k total,   964288k used,    62728k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,   105168k cached
---> STATUS: No problems, box is fine, no lag, etc..

2. append line = 2048M
top - 18:34:23 up 2 min,  2 users,  load average: 0.14, 0.14, 0.05
Tasks: 147 total,   1 running, 146 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.7%us,  1.2%sy,  0.4%ni, 95.2%id,  1.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2059696k total,   956324k used,  1103372k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,   102924k cached
---> STATUS: No problems, box is fine, no lag, etc..

3. append line = 4096M
top - 18:37:55 up 1 min,  1 user,  load average: 0.52, 0.19, 0.07
Tasks: 143 total,   1 running, 142 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.9%us,  2.2%sy,  0.7%ni, 91.6%id,  2.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3339536k total,   949792k used,  2389744k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,    99920k cached

$ time ssh p34 uptime
 19:00:16 up 1 min,  1 user,  load average: 0.67, 0.18, 0.06
real    0m0.159s
user    0m0.013s
sys     0m0.003s
---> STATUS: No problems, box is fine, no lag, etc..

4. append line = "" (use all 8GB)

top - 18:52:50 up 9 min,  1 user,  load average: 2.88, 2.43, 1.41
Tasks: 149 total,   3 running, 146 sleeping,   0 stopped,   0 zombie
Cpu(s): 36.3%us,  2.2%sy, 10.3%ni, 50.8%id,  0.4%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8104460k total,  1064416k used,  7040044k free,     3296k buffers
Swap: 16787768k total,        0k used, 16787768k free,   201852k cached

$ ssh p34
ssh: connect to host p34 port 22: Connection refused

The machine takes 5-10 minutes to boot and acts like a 286 computer.  About 8
minutes later:


$ time ssh p34 uptime  # 5 SECONDS!! 36x slower when using 8GB of RAM
 18:51:39 up 8 min,  1 user,  load average: 2.74, 2.31, 1.30

real    0m5.757s
user    0m0.015s
sys     0m0.004s

The machine is VERY slow, and this is on a gigabit network.  I/O does not seem
to be affected; CPU-bound processes are.


  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2483 root      25   0 25324 5292 1072 R   96  0.1   4:37.12 mailgraph
 3604 logcheck  30  10  3408 1120  544 R   91  0.0   0:03.55 grep

These normally take seconds, but when I use all 8GB of memory they run
for a very long time.

Conclusion: For now, I will be using mem=4096M until someone can help me 
understand what is happening here.  Can anyone offer any insight?


I found it interesting that make menuconfig on x86_64 has none of the 4GB/64GB
options that I remember seeing on 32-bit.

The output of cat /proc/cpuinfo is shown below and my .config is attached:

p34:~# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 Quad CPU   @ 2.40GHz
stepping: 7
cpu MHz : 2397.606
cache size  : 4096 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 4
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips: 4797.72
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 Quad CPU   @ 2.40GHz
stepping: 7
cpu MHz : 2397.606
cache size  : 4096 KB
physical id : 0
siblings: 4
core id : 2
cpu cores   : 4
fpu : yes
fpu_exception   : yes
cpuid level

Re: [BUG] Something goes wrong with timer statistics.

2007-05-29 Thread David Miller

From: Ian Kumlien <[EMAIL PROTECTED]>
Date: Wed, 30 May 2007 00:51:52 +0200

> On tis, 2007-05-29 at 15:44 -0700, David Miller wrote:
> > From: "Michal Piotrowski" <[EMAIL PROTECTED]>
> > Date: Wed, 30 May 2007 00:41:46 +0200
> > 
> > > On 30/05/07, David Miller <[EMAIL PROTECTED]> wrote:
> > > > From: Ian Kumlien <[EMAIL PROTECTED]>
> > > > Date: Tue, 29 May 2007 23:38:48 +0200
> > > >
> > > > > As the daystar sets, i try to play some with my new would be
> > > > > firewall/server, but since this will be running for quite some time i
> > > > > have been experimenting with powertop to find out what i can do to
> > > > > limit its power usage.
> > > > >
> > > > > But, if i run powertop for too long or a few times too many... this
> > > > > happens:
> > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > > > >
> > > > > If i don't run powertop, it is rock solid... Compiling for hours,
> > > > > running memtest for hours etc etc...
> > > >
> > > > I see this same exact problem on sparc64.
> > > 
> > > Have you tried this patch?
> > > http://lkml.org/lkml/2007/5/29/392
> > 
> > Of course, I worked with Thomas on the fix :-)
> 
> I dunno if this applies to me, i run a 32 bit userspace...

That patch fixes a problem on 64-bit kernels, regardless of
userspace, when NOHZ is enabled.

Re: [PATCH] NOHZ: prevent multiplication overflow - stop timer for huge timeouts

2007-05-29 Thread David Miller

From: Thomas Gleixner <[EMAIL PROTECTED]>
Date: Tue, 29 May 2007 23:47:39 +0200

> get_next_timer_interrupt() returns a delta of (LONG_MAX >> 1) in case
> there is no timer pending. On 64 bit machines this results in a
> multiplication overflow in tick_nohz_stop_sched_tick(). 
> 
> Reported by: Dave Miller <[EMAIL PROTECTED]>
> 
> Make the return value a constant and limit the return value to a 32 bit
> value.
> 
> When the max timeout value is returned, we can safely stop the tick
> timer device. The max jiffies delta results in a 12 days timeout for
> HZ=1000.
> 
> In the long term the get_next_timer_interrupt() code needs to be
> reworked to return ktime instead of jiffies, but we have to wait until
> the last users of the original NO_IDLE_HZ code are converted.
> 
> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

Acked-off-by: David S. Miller <[EMAIL PROTECTED]>
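
The arithmetic in the patch description can be checked with a small
user-space sketch.  It is not the kernel code: it assumes the jiffies delta is
multiplied by the tick length in nanoseconds into a signed 64-bit value, and it
assumes a cap of (1 << 30) - 1 jiffies, since the description only says "a 32
bit value".  The names delta_overflows(), jiffies_to_days() and
MAX_DELTA_JIFFIES are made up for illustration, and __builtin_mul_overflow()
is a GCC/Clang builtin.

```c
#include <stdint.h>

#define HZ            1000ULL                /* from the description: HZ=1000 */
#define NSEC_PER_SEC  1000000000ULL
#define TICK_NSEC     (NSEC_PER_SEC / HZ)    /* nanoseconds per jiffy */

/* Assumed cap; the description only states that it fits in 32 bits. */
#define MAX_DELTA_JIFFIES ((1ULL << 30) - 1)

/*
 * Returns 1 if delta_jiffies * TICK_NSEC does not fit in a signed
 * 64-bit nanosecond value, 0 otherwise.
 */
int delta_overflows(uint64_t delta_jiffies)
{
    int64_t ns;

    return __builtin_mul_overflow(delta_jiffies, TICK_NSEC, &ns) ? 1 : 0;
}

/* Whole days covered by a jiffies delta at the HZ defined above. */
uint64_t jiffies_to_days(uint64_t delta_jiffies)
{
    return delta_jiffies / HZ / 86400;
}
```

Under these assumptions, delta_overflows(INT64_MAX >> 1) reports an overflow
(this models LONG_MAX >> 1 on a 64-bit machine), the capped delta does not
overflow, and jiffies_to_days(MAX_DELTA_JIFFIES) gives 12, which matches the
"12 days" figure quoted above.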

Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Gary Zambrano

On Tue, 2007-05-29 at 18:39 -0400, Jeff Garzik wrote:

> We check for 0xffffffff because that is often how a fault is indicated,
> when the memory location is read during or immediately after hotplug (or
> if the PCI bus is truly faulty).  So for most hardware, you see
>
> tmp = read(irq status)
> if (!tmp)
>   return irq-none /* no irq events raised */
> if (tmp == 0xffffffff)
>   return irq-none /* hot unplug or h/w fault */
> 
> and the method that determines no interrupt handling is needed.
> 

I guess you are right, but then shouldn't the driver be checking for
faults in other parts of the code too? What if a fault/hotplug occurs
immediately after an interrupt, but before a tx?
Thanks,
Gary
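
The status check Jeff describes can be sketched in plain C roughly as follows.
This is only an illustration, not the b44 driver code: it assumes a 32-bit
interrupt status register, where a read from an absent PCI device returns all
ones (0xffffffff), and the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result codes; the kernel uses its own irqreturn_t type. */
enum irq_result {
    IRQ_NOT_OURS = 0,   /* nothing to handle */
    IRQ_WAS_HANDLED = 1 /* device raised events that need servicing */
};

/*
 * Classify a value read from the device's interrupt status register.
 * The register read itself is left to the caller.
 */
enum irq_result classify_irq_status(uint32_t status)
{
    if (status == 0)
        return IRQ_NOT_OURS;        /* no irq events raised */
    if (status == 0xffffffffu)
        return IRQ_NOT_OURS;        /* hot unplug or h/w fault */
    return IRQ_WAS_HANDLED;
}
```

Any other status value falls through to normal event handling.  Gary's
question stands: this only guards the interrupt path, so a fault that occurs
right after the check is not caught by it.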


[PATCH] Fix broken CLIR in isdn driver

2007-05-29 Thread Karsten Keil

I noticed that CLIR (aka "hide your calling number") in isdn_tty is broken:
The at-command parser filters out the required "R" (e.g. ATDR089123456)
It's been broken for a *very* long time.

Signed-off-by: Karsten Keil <[EMAIL PROTECTED]>
Signed-off-by: Matthias Goebl <[EMAIL PROTECTED]>
---
 drivers/isdn/i4l/isdn_tty.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/isdn/i4l/isdn_tty.c b/drivers/isdn/i4l/isdn_tty.c
index ea5f30d..4e5f87c 100644
--- a/drivers/isdn/i4l/isdn_tty.c
+++ b/drivers/isdn/i4l/isdn_tty.c
@@ -2693,8 +2693,9 @@ isdn_tty_getdial(char *p, char *q,int cnt)
 	int limit = ISDN_MSNLEN - 1;	/* MUST match the size of interface var to avoid
 					buffer overflow */
 
-	while (strchr(" 0123456789,#.*WPTS-", *p) && *p && --cnt>0) {
+	while (strchr(" 0123456789,#.*WPTSR-", *p) && *p && --cnt>0) {
 		if ((*p >= '0' && *p <= '9') || ((*p == 'S') && first) ||
+		    ((*p == 'R') && first) ||
 		    (*p == '*') || (*p == '#')) {
 			*q++ = *p;
 			limit--;


-- 
Karsten Keil
SuSE Labs
ISDN and VOIP development
SUSE LINUX Products GmbH, Maxfeldstr.5 90409 Nuernberg, GF: Markus Rex, HRB 
16746 (AG Nuernberg)
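
For readers who want to try the filtering rule outside the kernel, here is a
minimal user-space sketch of the patched behaviour.  Only the accepted
character set and the copy condition come from the diff above; the function
name filter_dial(), its signature, the output-size handling and the point at
which "first" is cleared are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-alone dial-string filter.  Characters from the
 * accepted set are consumed; only digits, '*', '#', and a leading 'S'
 * or 'R' are copied to the output.  Parsing stops at the first
 * character outside the accepted set or when the output is full.
 */
void filter_dial(const char *in, char *out, size_t outsz)
{
    int first = 1;      /* assumed: true only for the first character */
    size_t n = 0;

    if (outsz == 0)
        return;

    /* Test *in first: strchr() would also match the terminating NUL. */
    while (*in && strchr(" 0123456789,#.*WPTSR-", *in) && n + 1 < outsz) {
        char c = *in++;

        if ((c >= '0' && c <= '9') || (c == 'S' && first) ||
            (c == 'R' && first) || c == '*' || c == '#')
            out[n++] = c;
        first = 0;
    }
    out[n] = '\0';
}
```

With this sketch, the dial string from the description, "R089123456", is
copied through unchanged, while an 'R' that is not the first character is
consumed but not copied.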
