date:20070818

Re: [PATCH] missing return in bridge sysfs code

2007-08-18 Thread David Miller

From: Al Viro <[EMAIL PROTECTED]>
Date: Sun, 19 Aug 2007 04:51:26 +0100

> 
> Signed-off-by: Al Viro <[EMAIL PROTECTED]>

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

Thanks for catching this Al.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

The vi editor causes brain damage

2007-08-18 Thread Marc Perkel

Let me give you an example of the difference between
Linux open source world brain damaged thinking and
what it's like out here in the real world.

Go to a directory with 10k files and type:

rm *

What do you get?

/bin/rm: Argument list too long

If you map a network drive in DOS and type:

del *

It works.

That's the problem with the type of thinking in the
open source world. Why can DOS delete an infinite
number of files and rm can't? Because rm was written
using the "vi" editor and it causes brain damage and
that's why after 20 years rm hasn't caught up with
del.

Before everyone gets pissed off and freaks out why
don't you ponder the question why rm won't delete all
the files in the directory. If you can't grasp that
then you're brain damaged.

Think big people. Say NO to vi!
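
On the question of why rm won't delete all the files: the shell expands
"*" into one argument per file, and the kernel caps the total size of the
argument list handed to execve(), so the error is raised before rm even
starts. A program that walks the directory itself never builds such a list.
A minimal sketch in C, assuming a POSIX system; remove_all_files() is a
made-up name, not part of any existing tool:

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Remove every non-directory entry in `path` whose name does not start
 * with a dot (roughly what the shell's "*" would have matched).
 * Returns the number of entries removed, or -1 if the directory
 * cannot be opened.
 */
long remove_all_files(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *ent;
    long removed = 0;

    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        struct stat st;

        if (ent->d_name[0] == '.')
            continue;               /* skips ".", ".." and hidden files */
        if (fstatat(dirfd(dir), ent->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
            continue;
        if (S_ISDIR(st.st_mode))
            continue;               /* plain rm refuses directories too */
        if (unlinkat(dirfd(dir), ent->d_name, 0) == 0)
            removed++;
    }
    closedir(dir);
    return removed;
}
```

Names are handled one at a time, so the number of files in the directory
does not matter.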


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


   


general protection fault with powernow-k8 frequency scaling on x86-64

2007-08-18 Thread Hamish Moffatt

I'm running an Athlon64 x2 system which regularly issues general
protection faults and later crashes whenever frequency scaling is
enabled.

If the system runs at full frequency all the time (e.g. the "performance"
governor), even with powernow-k8 and other governors loaded, it is
perfectly stable. 

When ondemand or userspace is loaded and the system switches down, 
it will crash within the next few hours. It is generally idle; I can't
see anything in particular that causes it to crash. The crashes seem to
occur in pretty much random processes.

I'm running 2.6.22.1 (Debian's package), although this first started
happening around 2.6.17. I've occasionally seen other reports of this
problem on lkml, but no solution, e.g. this one from 2.6.13, although
I never saw it that early:
http://lkml.org/lkml/2005/9/2/131

Here's an example fault; a few more occur a few minutes after that, then
the whole system locks up soon after. The complete kernel log is below.

Any ideas?

thanks
Hamish


Aug 19 06:25:32 noddy kernel: stack segment:  [1] SMP 
Aug 19 06:25:32 noddy kernel: CPU 1 
Aug 19 06:25:32 noddy kernel: Modules linked in: tcp_diag inet_diag binfmt_misc 
rfcomm l2cap bluetooth lirc_dev tun powernow_k8 processor cpufreq_userspace 
cpufreq_stats cpufreq_powersave cpufreq_ondemand freq_table 
cpufreq_conservative ipt_ULOG ipt_recent nf_conntrack_ipv4 xt_state 
nf_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables mkiss ax25 
crc16 ipv6 parport_serial parport_pc dm_snapshot dm_mirror dm_mod it87 
hwmon_vid i2c_isa lp parport ide_disk ide_generic snd_ens1371 snd_seq_dummy 
snd_seq_oss firewire_ohci snd_seq_midi snd_seq_midi_event snd_seq firewire_core 
crc_itu_t snd_rawmidi skge sata_sil snd_seq_device ide_cd cdrom snd_intel8x0 
snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd 
soundcore shpchp pci_hotplug forcedeth k8temp snd_page_alloc i2c_nforce2 
ehci_hcd ohci_hcd psmouse i2c_core amd74xx serio_raw analog gameport joydev 
tsdev pcspkr floppy ext3 jbd mbcache raid1 md_mod sd_mod ata_generic sata_nv 
libata scsi_mod generic ide_core evdev
Aug 19 06:25:32 noddy kernel: Pid: 5200, comm: find Not tainted 2.6.22-1-amd64 
#1
Aug 19 06:25:32 noddy kernel: RIP: 0010:[]  
[] __d_lookup+0xdb/0x100
Aug 19 06:25:32 noddy kernel: RSP: 0018:810065a47bf8  EFLAGS: 00010286
Aug 19 06:25:32 noddy kernel: RAX: c200 RBX: 81007fd041ed RCX: 
0012
Aug 19 06:25:32 noddy kernel: RDX: 0003e761 RSI: 00c3b74333bfe761 RDI: 
8100611c87d0
Aug 19 06:25:32 noddy kernel: RBP: ccff810064d7f578 R08:  R09: 

Aug 19 06:25:32 noddy kernel: R10: 2f53454741535345 R11: 802d3f8e R12: 
8100611c87d0
Aug 19 06:25:32 noddy kernel: R13: 810065a47cb8 R14: 2941d172 R15: 
0009
Aug 19 06:25:32 noddy kernel: FS:  2afa831016e0() 
GS:81007fd06cc0() knlGS:
Aug 19 06:25:32 noddy kernel: CS:  0010 DS:  ES:  CR0: 8005003b
Aug 19 06:25:32 noddy kernel: CR2: 2b9c5c009236 CR3: 65a74000 CR4: 
06e0
Aug 19 06:25:32 noddy kernel: Process find (pid: 5200, threadinfo 
810065a46000, task 81006f896200)
Aug 19 06:25:32 noddy kernel: Stack:  810065ae6000 81007fd041ed 
810065a47e48 8100611ca448
Aug 19 06:25:32 noddy kernel:  81007fcca680 810065a47e48 
810065a47cb8 8028ae0e
Aug 19 06:25:32 noddy kernel:  8100611c41ed 810065a47cc8 
81007fcca680 81007fd041ed
Aug 19 06:25:32 noddy kernel: Call Trace:
Aug 19 06:25:32 noddy kernel:  [] do_lookup+0x2a/0x1ae
Aug 19 06:25:32 noddy kernel:  [] __link_path_walk+0x8ec/0xd9d
Aug 19 06:25:32 noddy kernel:  [] link_path_walk+0x58/0xe0
Aug 19 06:25:32 noddy kernel:  [] do_path_lookup+0x1a0/0x1c3
Aug 19 06:25:32 noddy kernel:  [] getname+0x14c/0x190
Aug 19 06:25:32 noddy kernel:  [] __user_walk_fd+0x37/0x53
Aug 19 06:25:32 noddy kernel:  [] vfs_lstat_fd+0x18/0x47
Aug 19 06:25:32 noddy kernel:  [] sys_newlstat+0x19/0x31
Aug 19 06:25:32 noddy kernel:  [] sys_close+0x8c/0xc9
Aug 19 06:25:32 noddy kernel:  [] system_call+0x7e/0x83
Aug 19 06:25:32 noddy kernel: 
Aug 19 06:25:32 noddy kernel: 
Aug 19 06:25:32 noddy kernel: Code: 48 8b 45 00 0f 18 08 48 8d 5d e8 44 39 73 
30 75 e6 e9 70 ff 
Aug 19 06:25:32 noddy kernel: RIP  [] __d_lookup+0xdb/0x100
Aug 19 06:25:32 noddy kernel:  RSP 


And the complete log..

Aug 19 01:43:28 noddy kernel: klogd 1.5.0#1, log source = /proc/kmsg started.
Aug 19 01:43:28 noddy kernel: Linux version 2.6.22-1-amd64 (Debian 
2.6.22-trunk.1~snapshot.9252) ([EMAIL PROTECTED]) (gcc version 4.1.3 20070718 
(prerelease) (Debian 4.1.2-14+1)) #1 SMP Sat Aug 4 00:40:03 UTC 2007
Aug 19 01:43:28 noddy kernel: Command line: root=/dev/md1 ro console=tty0 
8250.nr_uarts=6 vga=795 3
Aug 19 01:43:28 noddy kernel: BIOS-provided physical RAM map:
Aug 19 01:43:28 noddy kernel:  BIOS-e820:  - 0009f400 
(usable)
Aug 19 01:43:28 noddy kernel:

Re: Thinking outside the box on file systems

2007-08-18 Thread Marc Perkel

No Al, there isn't any shortage of arrogance here.

Let me try to repeat what I'm talking about as simply
as I can.

First - I'm describing a kind of functionality and
suggesting Linux should have it. I know a lot of it
can be done because much of what I'm suggesting is
already working in Windows and Netware.

I'm not the one who's going to code it. I'm just
saying that it would be nice if Linux had the
functionality of other operating systems - and - take
it to the next level - match it and do even better.

As to thinking outside the box, what I'm proposing is
outside the box relative to Linux. It's not as
original as compared to Windows or Netware which is
even better.

The idea is that Linux is lacking features that other
OSs have. What I'm suggesting is that Linux not only
match them but create a rights layer that is more
powerful than the rest. I'm outlining a concept in the
hope that people will get excited about it and want to
build on the idea.

I'm just telling you what I'd like to see. I'm not
going to code it. So I'm only going to talk about what
is possible. How it's done will be up to any
programmers who might be inspired by the idea. If no
one is inspired then Linux will continue to be in last
place when it comes to file system features relating
to fine grain permissions.

In Linux, for example, users are allowed to delete
files that they are prohibited from reading or
writing. In Netware if a user can't read or write to
the file they won't even be able to see that the file
exists, let alone delete it.
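
The Linux behaviour described here follows from the Unix rule that
removing a name needs write and search permission on the containing
directory, not any permission on the file itself. A minimal sketch,
assuming a POSIX system with a writable /tmp; unlink_unreadable_file() is
a made-up name used only for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a file, strip all of its permission bits (mode 000), then
 * unlink it from a directory we own. Returns 0 if the unlink succeeded.
 */
int unlink_unreadable_file(void)
{
    char dir[] = "/tmp/permdemo-XXXXXX";
    int dfd, fd, rc = -1;

    if (mkdtemp(dir) == NULL)       /* directory is created with mode 0700 */
        return -1;

    dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd >= 0) {
        fd = openat(dfd, "secret", O_CREAT | O_EXCL | O_WRONLY, 0600);
        if (fd >= 0) {
            int chmod_rc = fchmod(fd, 0);   /* nobody may read or write */

            close(fd);
            /* Allowed anyway: the check is made against the directory. */
            rc = unlinkat(dfd, "secret", 0);
            if (chmod_rc != 0)
                rc = -1;
        }
        close(dfd);
    }
    rmdir(dir);
    return rc;
}
```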

In Netware I can move a directory tree into another
tree and the objects that have rights in the other
tree will have rights to all the new files without
having to run utilities on the command line to
recursively change the permission afterwards. 

The point - Linux isn't going to move forward and
catch up unless there is a fundamental change in the
thinking  behind Linux permissions. There is a
cultural lack of innovation here. I discussed this
with Andrew Morton and he made some suggestions but
there's real hostility towards new concepts here.
Something I don't understand. At some point Linux
needs to grow beyond just being an evolved Unix clone
and that's not going to happen if you don't think
differently.

I still believe that the VI editor causes brain
damage. :)




Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


  


[PATCH] missing return in bridge sysfs code

2007-08-18 Thread Al Viro


Signed-off-by: Al Viro <[EMAIL PROTECTED]>
---
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 88f4300..c65f54e 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -167,6 +167,7 @@ static ssize_t store_stp_state(struct device *d,
 	br_stp_set_enabled(br, val);
 	rtnl_unlock();
 
+	return len;
 }
 static DEVICE_ATTR(stp_state, S_IRUGO | S_IWUSR, show_stp_state,
   store_stp_state);

Re: [PATCH] x86: skip paravirt patching when appropriate

2007-08-18 Thread Zachary Amsden


Chris Wright wrote:
> * Chris Wright ([EMAIL PROTECTED]) wrote:
> > Now that I understand the problem, I do have a very simple (slightly
> > overkill) fix for paravirt patching.  This can be cleaned up to avoid
> > the copies when they aren't needed, but that will take a little more
> > auditing of the various patchers.  If you still prefer a revert I've
> > got one handy, and we can re-visit this all post .23.
>
> This one avoids the patching when necessary, but needs some validation on
> VMI (and Zach's not avail today).  I'll resend when I know it's working
> for all paravirt patchers.

I'm back.  I'll give this one a spin after dinner tonight.

Zach

Re: Thinking outside the box on file systems

2007-08-18 Thread Al Viro

On Sat, Aug 18, 2007 at 07:03:06PM -0700, [EMAIL PROTECTED] wrote:
> >>I suspect you will find it somewhat hard to convince *anybody* on
> >>this list to put either a regex engine or a Perl interpreter into the
> >>kernel.  I doubt you could even get a simple shell-style pattern
> >>matcher in.  First of all, both of the former chew up enormous gobs
> >>of stack space *AND* they're NP-complete.

Eh?  regex via NFA is O(expression size * string length) time and
O(expression size) space.  If you can show that regex matching is
NP-complete, you've got a good shot at Nevanlinna Prize...

Not that it made regex in kernel a good idea, but fair is fair -
unless you can show any mentioning of backrefs upthread...[1]
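
The bound above can be illustrated with a state-set simulation. A minimal
sketch, assuming a deliberately tiny regex subset (literal characters, '.',
and a postfix '*' on a single character, matched against the whole string);
nfa_match() and add_state() are made-up names, the pattern is assumed to be
well formed, and this says nothing about how an in-kernel matcher would
look:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Each pattern index is an NFA state; index m (the pattern length) is the
 * accepting state. Mark state i and every state reachable from it by
 * letting a starred item match zero times.
 */
static void add_state(const char *pat, size_t m, bool *set, size_t i)
{
    while (!set[i]) {
        set[i] = true;
        if (i + 1 < m && pat[i + 1] == '*')
            i += 2;                 /* skip "x*" without consuming input */
        else
            break;
    }
}

/* O(m * n) time and O(m) space for pattern length m and text length n. */
bool nfa_match(const char *pat, const char *text)
{
    size_t m = strlen(pat);
    bool *cur = calloc(m + 1, sizeof *cur);
    bool *next = calloc(m + 1, sizeof *next);
    bool matched = false;

    if (cur == NULL || next == NULL)
        goto out;

    add_state(pat, m, cur, 0);
    for (; *text != '\0'; text++) {
        bool *tmp;

        memset(next, 0, (m + 1) * sizeof *next);
        for (size_t i = 0; i < m; i++) {
            if (!cur[i])
                continue;
            if (pat[i] == '.' || pat[i] == *text) {
                bool starred = (i + 1 < m && pat[i + 1] == '*');

                /* A starred item loops on itself, a plain one advances. */
                add_state(pat, m, next, starred ? i : i + 1);
            }
        }
        tmp = cur;
        cur = next;
        next = tmp;
    }
    matched = cur[m];
out:
    free(cur);
    free(next);
    return matched;
}
```

Because all live states advance together, a pattern such as "a*a*a*a*b"
against a long run of 'a' costs one pass over the text rather than an
exponential search. Backreferences cannot be expressed this way, which is
where the complexity argument changes.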

>  You just can't do such
> >>matching even in polynomial time, let alone something that scales
> >>appropriately for an OS kernel like, say, O(log(n)).
> >
> >Already been done.  Take a look at "AppArmor" aka "Immunix".
> 
> don't forget the ACPI interpreter.

YAProof that bogons follow Bose statistics...

Re: Thinking outside the box on file systems

2007-08-18 Thread david


On Sat, 18 Aug 2007, Alan wrote:

> On Wed, 2007-08-15 at 13:22 -0400, Kyle Moffett wrote:
> > On Aug 15, 2007, at 13:09:31, Marc Perkel wrote:
> > > The idea is that people have permissions - not files.  By people I
> > > mean users, groups, managers, applications
> > > etc. One might even specify that there are no permission
> > > restrictions at all. Part of the process would be that the kernel
> > > load what code it will use for the permission system. It might even
> > > be a little perl script you write.
> > >
> > > Also - you aren't even giving permission to access files. It's
> > > permission to access name patterns. One could apply REGEX masks to
> > > names to determine permissions. So if you have permission to the
> > > name you have permission to the file.
> >
> > Please excuse me, I'm going to go stand over in the corner for a minute.
> >
> > *hahahahahaa hahahahahaaa hahaa hoo hee snicker sniff*
> >
> > *wanders back into the conversation*
> >
> > Sorry about that, pardon me.
> >
> > I suspect you will find it somewhat hard to convince *anybody* on
> > this list to put either a regex engine or a Perl interpreter into the
> > kernel.  I doubt you could even get a simple shell-style pattern
> > matcher in.  First of all, both of the former chew up enormous gobs
> > of stack space *AND* they're NP-complete.  You just can't do such
> > matching even in polynomial time, let alone something that scales
> > appropriately for an OS kernel like, say, O(log(n)).
>
> Already been done.  Take a look at "AppArmor" aka "Immunix".

don't forget the ACPI interpreter.

David Lang

TIMER0 interrupt's problem in kernel 2.6.21 ( realview-smp_defconfig)

2007-08-18 Thread Xu Yang

I am running kernel 2.6.21 on my realview_eb_mpcore system.

At first I found that it was stuck at the calibrating loop, because
timer0 did not raise any interrupts.

After enabling timer0's interrupt, I found that the interrupt routine
(realview_timer_interrupt()) is still reached even when there is no
timer interrupt.

I added an instruction to realview_timer_interrupt() that disables the
timer interrupt, so that the interrupt is disabled after the first one.
Even then, I can still see the CPU entering the interrupt routine.

I guess this may be caused by a problem in gic_ack() or some other
function?

Has anyone ever run into this situation?

Does anyone know how to handle it?

Any ideas are appreciated.

regards,

Yang

Re: [PATCH 4/4] maps: /proc/<pid>/pmaps interface - memory maps in granularity of pages

2007-08-18 Thread Fengguang Wu

On Sat, Aug 18, 2007 at 12:22:26PM -0500, Matt Mackall wrote:
> > > > So VSZ:RSS ratio actually goes up with memory pressure.
> > > 
> > > And yes.
> > > 
> > > But that's not what I'm talking about. You're likely to have more
> > > holes in your ranges with memory pressure as things that aren't active
> > > get paged or swapped out and back in. And because we're walking the
> > > LRU more rapidly, we'll flip over a lot of the active bits more often
> > > which will mean more output.
> > > 
> > > >   - page range is a good unit of locality. They are more likely to be
> > > > reclaimed as a whole. So (RSS:page_ranges) wouldn't degrade as much.
> > > 
> > > There is that. The relative magnitude of the different effects is
> > > unclear. But it is clear that the worst case for pmap is much worse
> > > than pagemap (two lines per page of RSS?). 
> > 
> > It's one line per page. No sane app will make vmas proliferate.
> 
> Sane apps are few and far between.

Very likely, and they will bloat maps/smaps/pmaps alike :(

> > So let's talk about the worst case.
> > 
> > pagemap's data set size is determined by VSZ.
> > 4GB VSZ means 1M PFNs, hence 8MB pagemap data.
> > 
> > pmaps's data set size is bounded by RSS hence physical memory.
> > 4GB RSS means up to 1M page ranges, hence ~20M pmaps data.
> > Not too bad :)
> 
> Hmmm, I've been misreading the output.
> 
> What does it do with nonlinear VMAs?

The implementation gets the offset from page_index(page), so it works
the same way in linear and nonlinear VMAs. Depending on how one makes
the remap_file_pages() calls, the output lines may not be strictly
ordered by offset, may overlap, or may cover small page ranges. 
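
The worst-case figures quoted earlier in this message can be reproduced
with a little arithmetic. A minimal sketch; the 4 KiB page size, 8 bytes
per pagemap entry and roughly 20 bytes per pmaps line are assumptions read
back from the quoted numbers, and the function names are made up:

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES     4096u   /* assumed 4 KiB pages */
#define PAGEMAP_ENTRY_BYTES 8u      /* assumed size of one PFN entry */
#define PMAPS_LINE_BYTES    20u     /* assumed average length of one line */

/* pagemap emits one entry per virtual page, so it scales with VSZ. */
uint64_t pagemap_worst_case(uint64_t vsz_bytes)
{
    return vsz_bytes / PAGE_SIZE_BYTES * PAGEMAP_ENTRY_BYTES;
}

/* pmaps emits at most one line per resident page, so it scales with RSS. */
uint64_t pmaps_worst_case(uint64_t rss_bytes)
{
    return rss_bytes / PAGE_SIZE_BYTES * PMAPS_LINE_BYTES;
}
```

With 4 GiB in both cases this gives 1M pages, hence 8 MiB of pagemap data
and about 20 MiB of pmaps data.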


Re: tracking MAINTAINERS versus tracking SUBSYSTEMS

2007-08-18 Thread Joe Perches

On Sat, 2007-08-18 at 13:35 -0400, Robert P. J. Day wrote:
>   $ show_subsystem drivers/bluetooth/bpa10x.c
>   BLUETOOTH

"what's a subsystem"?
I'm not sure there is an appropriate definition.
If there is an appropriate definition, why
should anyone care what subsystem a particular
file is in?

> 1) it reduces the MAINTAINERS file back to what it should be in the
> first place -- a simple reference list of each kernel subsystem, and
> who's responsible for it, so that constant reshuffling of files or
> directories in a particular subsystem doesn't require constant
> updating of the MAINTAINERS file.

I'd still be happy if MAINTAINERS went away.
I'm not sure what good it does other than have
a link to mailing lists that otherwise might not
be CC'd on patches.

> thoughts?

For user submission of bug reports, perhaps it'd be
more useful to have a submitkernelbugreport script
and a network enabled "dispatch bug report" to
appropriate maintainers service.

Try something out and see what happens.

Good luck.  Joe


[PATCH] soft-fp.h tpyo

2007-08-18 Thread Al Viro

Signed-off-by: Al Viro <[EMAIL PROTECTED]>
---
diff --git a/include/math-emu/soft-fp.h b/include/math-emu/soft-fp.h
index a0721ef..a6f873b 100644
--- a/include/math-emu/soft-fp.h
+++ b/include/math-emu/soft-fp.h
@@ -98,7 +98,7 @@
 #endif
 
 #ifndef FP_TRAPPING_EXCEPTIONS
-#define FP_TRAPPING_EXCPETIONS 0
+#define FP_TRAPPING_EXCEPTIONS 0
 #endif
 
 #define FP_SET_EXCEPTION(ex)   \

Git User's Survey 2007

2007-08-18 Thread Jakub Narebski

Hi all,

We would like to ask you a few questions about your use of the GIT
version control system. This survey is mainly to understand who is
using GIT, how and why.

The results will be discussed on the git mailing list and published to 
the GIT wiki at http://git.or.cz/gitwiki/GitSurvey2007

We'll close the survey in three weeks starting from 20 August 2007,
on 10 September 2007.

Please devote a few minutes of your time to fill in this simple
questionnaire. It will help the git community a lot to understand your
needs, what you like about GIT, and of course what you don't like about it.

The survey can be found here:
  http://www.survey.net.nz/survey.php?94e135ff41e871a1ea5bcda3ee1856d9
  http://tinyurl.com/26774s

-- 
Jakub Narebski

Re: [PATCH 1/5] exec: kill unsafe BUG_ON(sig->count) checks

2007-08-18 Thread Roland McGrath

Those BUG_ON's were there because of past bugs and fragility in the
de_thread and exit synchronization stuff.  There's no real need to leave
them (in fixed form) if we are confident of that stuff working right now,
or have other assertions to give us that confidence when we change it.


Thanks,
Roland

Re: Thinking outside the box on file systems

2007-08-18 Thread Alan

On Wed, 2007-08-15 at 10:34 -0700, Marc Perkel wrote:

> Keep in mind that this is about thinking outside the
> box. Don't let new ideas scare you.

My cat thinks outside the box all the time.  Cleaning it up is a real
pain.


Re: [RFC,PATCH 5/5] exec: RT sub-thread can livelock and monopolize CPU on exec

2007-08-18 Thread Roland McGrath

Maybe it can use wait_task_inactive, which IIUC is being changed to address
the same RT issue.  OTOH, notify_count exists only for this.  So maybe the
better way is to clean that whole mechanism up somehow.  The exit.c changes
in your patch seem to be making it more mysterious rather than less so.
I haven't really thought much about the better solution.


Thanks,
Roland

Re: Thinking outside the box on file systems

2007-08-18 Thread Alan

On Wed, 2007-08-15 at 13:22 -0400, Kyle Moffett wrote:
> On Aug 15, 2007, at 13:09:31, Marc Perkel wrote:
> > The idea is that people have permissions - not files.  By people I  
> > mean users, groups, managers, applications
> > etc. One might even specify that there are no permission  
> > restrictions at all. Part of the process would be that the kernel  
> > load what code it will use for the permission system. It might even  
> > be a little perl script you write.
> >
> > Also - you aren't even giving permission to access files. It's  
> > permission to access name patterns. One could apply REGEX masks to  
> > names to determine permissions. So if you have permission to the  
> > name you have permission to the file.
> 
> Please excuse me, I'm going to go stand over in the corner for a minute.
> 
> *hahahahahaa hahahahahaaa hahaa hoo hee snicker sniff*
> 
> *wanders back into the conversation*
> 
> Sorry about that, pardon me.
> 
> I suspect you will find it somewhat hard to convince *anybody* on  
> this list to put either a regex engine or a Perl interpreter into the  
> kernel.  I doubt you could even get a simple shell-style pattern  
> matcher in.  First of all, both of the former chew up enormous gobs  
> of stack space *AND* they're NP-complete.  You just can't do such  
> matching even in polynomial time, let alone something that scales  
> appropriately for an OS kernel like, say, O(log(n)).

Already been done.  Take a look at "AppArmor" aka "Immunix".



Re: + proc-export-a-processes-resource-limits-via-proc-pid.patch added to -mm tree

2007-08-18 Thread Neil Horman

> 
> Neil, please, don't add tasklist_lock again. It was not easy to wipe it from
> fs/proc/ :) Just change this code to use rcu_read_lock().
> 

Ok, done/tested.  Thanks!

Currently, there exists no method for a process to query the resource
limits of another process.  They can be inferred via some mechanisms but they
cannot be explicitly determined.  Given that this information can be useful to
know during the debugging of an application, I've written this patch which
exports all of a process's limits via /proc/<pid>/limits.  Tested successfully
by myself on x86 on top of 2.6.23-rc2-mm1.


Signed-off-by: Neil Horman <[EMAIL PROTECTED]>


 base.c |   77 +
 1 file changed, 77 insertions(+)


diff --git a/fs/proc/base.c b/fs/proc/base.c
index ed2b224..86130b0 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -74,6 +74,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 
 /* NOTE:
@@ -323,6 +324,80 @@ static int proc_oom_score(struct task_struct *task, char *buffer)
 	return sprintf(buffer, "%lu\n", points);
 }
 
+struct limit_names {
+   char *name;
+   char *unit;
+};
+
+static const struct limit_names lnames[RLIM_NLIMITS] = {
+   [RLIMIT_CPU] = {"Max cpu time", "seconds"},
+   [RLIMIT_FSIZE] = {"Max file size", "bytes"},
+   [RLIMIT_DATA] = {"Max data size", "bytes"},
+   [RLIMIT_STACK] = {"Max stack size", "bytes"},
+   [RLIMIT_CORE] = {"Max core file size", "bytes"},
+   [RLIMIT_RSS] = {"Max resident set", "bytes"},
+   [RLIMIT_NPROC] = {"Max processes", "processes"},
+   [RLIMIT_NOFILE] = {"Max open files", "files"},
+   [RLIMIT_MEMLOCK] = {"Max locked memory", "bytes"},
+   [RLIMIT_AS] = {"Max address space", "bytes"},
+   [RLIMIT_LOCKS] = {"Max file locks", "locks"},
+   [RLIMIT_SIGPENDING] = {"Max pending signals", "signals"},
+   [RLIMIT_MSGQUEUE] = {"Max msgqueue size", "bytes"},
+   [RLIMIT_NICE] = {"Max nice priority", NULL},
+   [RLIMIT_RTPRIO] = {"Max realtime priority", NULL},
+};
+
+/* Display limits for a process */
+static int proc_pid_limits(struct task_struct *task, char *buffer)
+{
+	unsigned int i;
+	int count = 0;
+	unsigned long flags;
+	char *bufptr = buffer;
+
+	struct rlimit rlim[RLIM_NLIMITS];
+
+	rcu_read_lock();
+	lock_task_sighand(task, &flags);
+	if (task->signal == NULL){
+		unlock_task_sighand(task, &flags);
+		rcu_read_unlock();
+		return 0;
+	}
+	memcpy(rlim, task->signal->rlim, sizeof(struct rlimit) * RLIM_NLIMITS);
+	unlock_task_sighand(task, &flags);
+	rcu_read_unlock();
+
+	/*
+	 * print the file header
+	 */
+	count += sprintf(&bufptr[count], "%-25s %-20s %-20s %-10s\n",
+			"Limit", "Soft Limit", "Hard Limit", "Units");
+
+	for (i = 0; i < RLIM_NLIMITS; i++) {
+		if (rlim[i].rlim_cur == RLIM_INFINITY)
+			count += sprintf(&bufptr[count], "%-25s %-20s ",
+					 lnames[i].name, "unlimited");
+		else
+			count += sprintf(&bufptr[count], "%-25s %-20lu ",
+					 lnames[i].name, rlim[i].rlim_cur);
+
+		if (rlim[i].rlim_max == RLIM_INFINITY)
+			count += sprintf(&bufptr[count], "%-20s ", "unlimited");
+		else
+			count += sprintf(&bufptr[count], "%-20lu ",
+					 rlim[i].rlim_max);
+
+		if (lnames[i].unit)
+			count += sprintf(&bufptr[count], "%-10s\n",
+					 lnames[i].unit);
+		else
+			count += sprintf(&bufptr[count], "\n");
+	}
+
+	return count;
+}
+
 //
 /*   Here the fs part begins*/
 //
@@ -2017,6 +2092,7 @@ static const struct pid_entry tgid_base_stuff[] = {
INF("environ",S_IRUSR, pid_environ),
INF("auxv",   S_IRUSR, pid_auxv),
INF("status", S_IRUGO, pid_status),
+   INF("limits", S_IRUSR, pid_limits),
 #ifdef CONFIG_SCHED_DEBUG
REG("sched",  S_IRUGO|S_IWUSR, pid_sched),
 #endif
@@ -2310,6 +2386,7 @@ static const struct pid_entry tid_base_stuff[] = {
INF("environ",   S_IRUSR, pid_environ),
INF("auxv",  S_IRUSR, pid_auxv),
INF("status",S_IRUGO, pid_status),
+   INF("limits",S_IRUSR, pid_limits),
 #ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, pid_sched),
 #endif
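
For comparison, a process can already read its own limits with
getrlimit(2); what the patch adds is the view of another process. A
minimal userspace sketch, assuming a POSIX system; format_limit() and
print_own_nofile_limit() are made-up helper names, and the column widths
only approximate the layout used in the patch:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/resource.h>

/* Render one limit value, using "unlimited" for RLIM_INFINITY. */
static void limit_to_string(char *out, size_t size, rlim_t value)
{
    if (value == RLIM_INFINITY)
        snprintf(out, size, "unlimited");
    else
        snprintf(out, size, "%llu", (unsigned long long)value);
}

/*
 * Format one row: name, soft limit, hard limit, unit.
 * Returns the snprintf() result for the whole row.
 */
int format_limit(char *buf, size_t size, const char *name,
                 rlim_t soft, rlim_t hard, const char *unit)
{
    char soft_s[32], hard_s[32];

    limit_to_string(soft_s, sizeof soft_s, soft);
    limit_to_string(hard_s, sizeof hard_s, hard);
    return snprintf(buf, size, "%-25s %-20s %-20s %-10s\n",
                    name, soft_s, hard_s, unit ? unit : "");
}

/* Print the calling process's own open-file limit; 0 on success. */
int print_own_nofile_limit(void)
{
    struct rlimit rl;
    char line[128];

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    format_limit(line, sizeof line, "Max open files",
                 rl.rlim_cur, rl.rlim_max, "files");
    fputs(line, stdout);
    return 0;
}
```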
-- 
/***
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 [EMAIL PROTECTED]
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Paul E. McKenney

On Sat, Aug 18, 2007 at 03:41:13PM -0700, Linus Torvalds wrote:
> 
> 
> On Sat, 18 Aug 2007, Paul E. McKenney wrote:
> > 
> > One of the gcc guys claimed that he thought that the two-instruction
> > sequence would be faster on some x86 machines.  I pointed out that
> > there might be a concern about code size.  I chose not to point out
> > that people might also care about the other x86 machines.  ;-)
> 
> Some (very few) x86 uarchs do tend to prefer "load-store" like code 
> generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can 
> actually be faster on some of them. Not any that are relevant today, 
> though.

;-)

> Also, that has nothing to do with volatile, and should be controlled by 
> optimization flags (like -mtune). In fact, I thought there was a separate 
> flag to do that (ie something like "-mload-store"), but I can't find it, 
> so maybe that's just my fevered brain..

Good point, will suggest this if the need arises.

Thanx, Paul

Re: [PATCH 3/5] exec: simplify the new ->sighand allocation

2007-08-18 Thread Roland McGrath

>   - ENOMEM still can happen after de_thread(), ->sighand is not the last
> object we have to allocate

As long as this is true, I think it's the incontrovertible argument there
is no reason not to simplify it.  (Not that I don't also agree with your
other reasons).


Thanks,
Roland

Re: [PATCH 2/5] exec: simplify ->sighand switching

2007-08-18 Thread Roland McGrath

> There is no any reason to do recalc_sigpending() after changing ->sighand.
> To begin with, recalc_sigpending() does not take ->sighand into account.

I agree.  I think that call dates from before some other cleanups in that
code, when ->signal was changed there.  At the time, it was the most
conservative change to leave recalc_sigpending in case the side effects it
used to have were desirable (we've since decided to change those anyway).


Thanks,
Roland

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Linus Torvalds

On Sat, 18 Aug 2007, Paul E. McKenney wrote:
> 
> One of the gcc guys claimed that he thought that the two-instruction
> sequence would be faster on some x86 machines.  I pointed out that
> there might be a concern about code size.  I chose not to point out
> that people might also care about the other x86 machines.  ;-)

Some (very few) x86 uarchs do tend to prefer "load-store" like code 
generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can 
actually be faster on some of them. Not any that are relevant today, 
though.

Also, that has nothing to do with volatile, and should be controlled by 
optimization flags (like -mtune). In fact, I thought there was a separate 
flag to do that (ie something like "-mload-store"), but I can't find it, 
so maybe that's just my fevered brain..

Linus
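
The code-generation point above can be checked by hand with a small test file. This is only a sketch: the variable and function names are made up for illustration, and what a given gcc emits depends on its version and on flags such as -mtune, so no particular assembly output is promised. Compile with "gcc -O2 -S" and compare the two functions.

```c
/* Sketch: a plain read versus a volatile read feeding a compare.
 * Build with: gcc -O2 -S cmp.c   and inspect cmp.s
 * All names are hypothetical, chosen for this example only. */
int plain_counter = 1;
volatile int volatile_counter = 1;

/* The compiler is free to compare against memory directly here. */
int plain_le_one(void)
{
	return plain_counter <= 1;
}

/* With volatile, the thread reports that gcc loads into a register
 * first and then compares; the returned value is the same either way. */
int volatile_le_one(void)
{
	return volatile_counter <= 1;
}
```

Both functions return the same value for the same counter; only the instruction selection is expected to differ.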

Re: [PATCH 1/5] exec: kill unsafe BUG_ON(sig->count) checks

2007-08-18 Thread Paul E. McKenney

On Sat, Aug 18, 2007 at 09:39:36PM +0400, Oleg Nesterov wrote:
> de_thread:
> 
>   if (atomic_read(&sig->count) <= 1)
>   BUG_ON(atomic_read(&sig->count) != 1);
> 
> This is not safe without the rmb() in between. The results of two correctly
> ordered __exit_signal()->atomic_dec_and_test()'s could be seen out of order
> on our CPU.
> 
> The same is true for the "thread_group_empty()" case, __unhash_process()'s
> changes could be seen before atomic_dec_and_test(&sig->count).
> 
> On some platforms (including i386) atomic_read() doesn't provide even the
> compiler barrier, in that case these checks are simply racy.
> 
> Remove these BUG_ON()'s. Alternatively, we can do something like
> 
>   BUG_ON( ({ smp_rmb(); atomic_read(&sig->count) != 1; }) );

Good catches!

Acked-by: Paul E. McKenney <[EMAIL PROTECTED]>

> Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>
> 
> --- t/fs/exec.c~1_BUG_ON  2007-08-18 17:36:58.0 +0400
> +++ t/fs/exec.c   2007-08-18 18:19:41.0 +0400
> @@ -784,7 +784,6 @@ static int de_thread(struct task_struct 
>* and we can just re-use it all.
>*/
>   if (atomic_read(&sig->count) <= 1) {
> - BUG_ON(atomic_read(&sig->count) != 1);
>   signalfd_detach(tsk);
>   exit_itimers(sig);
>   return 0;
> @@ -929,8 +928,6 @@ no_thread_group:
>   if (leader)
>   release_task(leader);
> 
> - BUG_ON(atomic_read(&sig->count) != 1);
> -
>   if (atomic_read(>count) == 1) {
>   /*
>* Now that we nuked the rest of the thread group,
> 
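
The alternative mentioned in the quoted patch description, a read barrier between the two reads, can be modelled in userspace roughly as follows. This is a sketch under stated assumptions: it uses C11 <stdatomic.h> rather than the kernel's atomic_t, an acquire fence stands in for smp_rmb(), and struct fake_signal and count_is_exactly_one() are hypothetical names, not kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the thread group's signal struct. */
struct fake_signal {
	atomic_int count;
};

/* Returns true only if the count was seen as <= 1 and a second read,
 * ordered after the first by a fence, sees exactly 1. */
bool count_is_exactly_one(struct fake_signal *sig)
{
	if (atomic_load_explicit(&sig->count, memory_order_relaxed) > 1)
		return false;

	/* Userspace analogue of smp_rmb(): the read above is ordered
	 * before the read below. */
	atomic_thread_fence(memory_order_acquire);

	return atomic_load_explicit(&sig->count, memory_order_relaxed) == 1;
}
```

Without the fence, nothing stops the second read from observing an older value than the first, which is the race the patch description points out.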

[PATCH] x86: skip paravirt patching when appropriate

2007-08-18 Thread Chris Wright

* Chris Wright ([EMAIL PROTECTED]) wrote:
> Now that I understand the problem, I do have a very simple (slightly
> overkill) fix for paravirt patching.  This can be cleaned up to avoid
> the copies when they aren't needed, but that will take a little more
> auditing of the various patchers.  If you still prefer a revert I've
> got one handy, and we can re-visit this all post .23.

This one avoids the patching when necessary, but needs some validation on
VMI (and Zach's not avail today).  I'll resend when I know it's working
for all paravirt patchers.

thanks,
-chris
--

Subject: [PATCH] x86: skip paravirt patching when appropriate
From: Chris Wright <[EMAIL PROTECTED]>

commit d34fda4a84c18402640a1a2342d6e6d9829e6db7 was a little overkill
in the case where a paravirt patcher chooses to leave patch site
unpatched.  Instead of copying original instructions to temp buffer
then back to patch site, simply skip patching those sites altogether.

Cc: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>
Cc: Zach Amsden <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/i386/kernel/alternative.c |4 ++--
 arch/i386/kernel/paravirt.c|   10 +-
 arch/i386/kernel/vmi.c |4 ++--
 include/asm-i386/paravirt.h|3 +++
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/i386/kernel/alternative.c b/arch/i386/kernel/alternative.c
index 9f4ac8b..b81d87e 100644
--- a/arch/i386/kernel/alternative.c
+++ b/arch/i386/kernel/alternative.c
@@ -366,10 +366,10 @@ void apply_paravirt(struct paravirt_patch_site *start,
unsigned int used;
 
BUG_ON(p->len > MAX_PATCH_LEN);
-   /* prep the buffer with the original instructions */
-   memcpy(insnbuf, p->instr, p->len);
used = paravirt_ops.patch(p->instrtype, p->clobbers, insnbuf,
  (unsigned long)p->instr, p->len);
+   if (used == PV_NO_PATCH)
+   continue;
 
BUG_ON(used > p->len);
 
diff --git a/arch/i386/kernel/paravirt.c b/arch/i386/kernel/paravirt.c
index 739cfb2..a36ce34 100644
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -122,7 +122,7 @@ unsigned paravirt_patch_nop(void)
 
 unsigned paravirt_patch_ignore(unsigned len)
 {
-   return len;
+   return PV_NO_PATCH;
 }
 
 struct branch {
@@ -139,9 +139,9 @@ unsigned paravirt_patch_call(void *insnbuf,
unsigned long delta = (unsigned long)target - (addr+5);
 
if (tgt_clobbers & ~site_clobbers)
-   return len; /* target would clobber too much for this site */
+   return PV_NO_PATCH; /* target would clobber too much for this site */
if (len < 5)
-   return len; /* call too long for patch site */
+   return PV_NO_PATCH; /* call too long for patch site */
 
b->opcode = 0xe8; /* call */
b->delta = delta;
@@ -157,7 +157,7 @@ unsigned paravirt_patch_jmp(const void *target, void *insnbuf,
unsigned long delta = (unsigned long)target - (addr+5);
 
if (len < 5)
-   return len; /* call too long for patch site */
+   return PV_NO_PATCH; /* call too long for patch site */
 
b->opcode = 0xe9;   /* jmp */
b->delta = delta;
@@ -196,7 +196,7 @@ unsigned paravirt_patch_insns(void *insnbuf, unsigned len,
unsigned insn_len = end - start;
 
if (insn_len > len || start == NULL)
-   insn_len = len;
+   insn_len = PV_NO_PATCH;
else
memcpy(insnbuf, start, insn_len);
 
diff --git a/arch/i386/kernel/vmi.c b/arch/i386/kernel/vmi.c
index 18673e0..27ae004 100644
--- a/arch/i386/kernel/vmi.c
+++ b/arch/i386/kernel/vmi.c
@@ -118,7 +118,7 @@ static unsigned patch_internal(int call, unsigned len, void *insnbuf,
 
case VMI_RELOCATION_NONE:
/* leave native code in place */
-   break;
+   return PV_NO_PATCH;
 
default:
BUG();
@@ -153,7 +153,7 @@ static unsigned vmi_patch(u8 type, u16 clobbers, void *insns,
default:
break;
}
-   return len;
+   return PV_NO_PATCH;
 }
 
 /* CPUID has non-C semantics, and paravirt-ops API doesn't match hardware ISA */
diff --git a/include/asm-i386/paravirt.h b/include/asm-i386/paravirt.h
index 9fa3fa9..b26794f 100644
--- a/include/asm-i386/paravirt.h
+++ b/include/asm-i386/paravirt.h
@@ -252,6 +252,9 @@ extern struct paravirt_ops paravirt_ops;
 #define paravirt_alt(insn_string)  \
_paravirt_alt(insn_string, "%c[paravirt_typenum]", "%c[paravirt_clobber]")
 
+enum {
+   PV_NO_PATCH = -1
+};
 unsigned paravirt_patch_nop(void);
 unsigned paravirt_patch_ignore(unsigned len);
 unsigned
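
The idea of the patch, a sentinel return value that tells the caller to leave the site alone, can be sketched outside the kernel like this. Everything here is a simplified model: NO_PATCH, struct patch_site, demo_patch() and apply_patches() are hypothetical names, the sketch assumes each site's len is at most MAX_PATCH_LEN, and plain memcpy() stands in for whatever the kernel uses to write the code.

```c
#include <string.h>

#define MAX_PATCH_LEN 16
#define NO_PATCH ((unsigned)-1)	/* hypothetical sentinel, mirrors PV_NO_PATCH = -1 */

struct patch_site {
	unsigned char instr[MAX_PATCH_LEN];	/* stand-in for the code at the site */
	unsigned len;
};

/* Hypothetical patcher: emits a 5-byte call if it fits, otherwise declines. */
unsigned demo_patch(unsigned char *insnbuf, unsigned len)
{
	if (len < 5)
		return NO_PATCH;	/* call too long for patch site */
	memset(insnbuf, 0, 5);
	insnbuf[0] = 0xe8;		/* call opcode, zero displacement */
	return 5;
}

/* Returns the number of sites that were actually rewritten. */
unsigned apply_patches(struct patch_site *sites, unsigned n)
{
	unsigned char insnbuf[MAX_PATCH_LEN];
	unsigned patched = 0;

	for (unsigned i = 0; i < n; i++) {
		struct patch_site *p = &sites[i];
		unsigned used = demo_patch(insnbuf, p->len);

		if (used == NO_PATCH)
			continue;	/* site stays untouched, no copy at all */

		memset(insnbuf + used, 0x90, p->len - used);	/* pad with x86 nops */
		memcpy(p->instr, insnbuf, p->len);
		patched++;
	}
	return patched;
}
```

Because the sentinel is compared before any length check, a declined site never reaches the copy, so the temp buffer does not need to be pre-filled.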

Re: [PATCH] lockdep: annotate rcu_read_{,un}lock()

2007-08-18 Thread Paul E. McKenney

On Fri, Aug 17, 2007 at 01:48:09PM -0500, Corey Minyard wrote:
> Paul E. McKenney wrote:
> >On Fri, Aug 17, 2007 at 09:56:45AM +0200, Peter Zijlstra wrote:
> >  
> >>On Thu, 2007-08-16 at 09:01 -0700, Paul E. McKenney wrote:
> >>
> >>>On Thu, Aug 16, 2007 at 04:25:07PM +0200, Peter Zijlstra wrote:
> >>>  
> There seem to be some unbalanced rcu_read_{,un}lock() issues of late,
> how about doing something like this:
> 
> >>>This will break when rcu_read_lock() and rcu_read_unlock() are invoked
> >>>from NMI/SMI handlers -- the raw_local_irq_save() in lock_acquire() will
> >>>not mask NMIs or SMIs.
> >>>
> >>>One approach would be to check for being in an NMI/SMI handler, and
> >>>to avoid calling lock_acquire() and lock_release() in those cases.
> >>>  
> >>It seems:
> >>
> >>#define nmi_enter() do { lockdep_off(); __irq_enter(); } while (0)
> >>#define nmi_exit()  do { __irq_exit(); lockdep_on(); } while (0)
> >>
> >>Should make it all work out just fine. (for NMIs at least, /me fully
> >>ignorant of the workings of SMIs)
> >>
> >
> >Very good point, at least for NMIs on i386 and x86_64.  Can't say that I
> >know much about SMIs myself.  Or about whatever equivalents to NMIs and
> >SMIs might exist on other platforms.  :-/  Of course, the other platforms
> >could be handled by making the RCU lockdep operate only on i386 and x86_64
> >if required.
> >
> >Corey, any advice on SMI handlers?  Is there something like nmi_enter()
> >and nmi_exit() that allows disabling lockdep?
> >  
> You will certainly need something like nmi_enter() and nmi_exit() for 
> SMIs, since they can occur at any time like NMIs.  As far as anything 
> else, you just have to be extremely careful and remember that it can 
> occur anyplace.  But you already know that :).

So we would need to create an smi_enter() and smi_exit() and place them
appropriately.  Any preferences?

> It would be nice if the PowerPC board vendors would tie watchdog 
> pretimeouts and some type of timer into the SMI input.  It would make 
> debugging certain problems much easier.  And all those Marvell bridge 
> chips have a watchdog pretimeout and I haven't seen any board vendor 
> wire it up :(.

Can't say that I have much influence over them, but I must agree
that debuggability is a very good thing!

Thanx, Paul
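
A userspace model of the enter/exit idea discussed above, for illustration only. smi_enter() and smi_exit() do not exist in the kernel at the time of this thread; they are the hypothetical hooks being proposed. The lockdep_off()/lockdep_on() functions below are simple counters written for this sketch, not the kernel's implementation, and model_rcu_read_lock() merely stands in for an annotated rcu_read_lock().

```c
/* Userspace model; nothing here is kernel code. */
int lockdep_recursion;	/* > 0 means lock tracking is suppressed */
int tracked_acquires;	/* how many acquisitions the "validator" saw */

void lockdep_off(void) { lockdep_recursion++; }
void lockdep_on(void)  { lockdep_recursion--; }

/* Hypothetical SMI hooks, modelled on the quoted nmi_enter()/nmi_exit(). */
#define smi_enter()	do { lockdep_off(); } while (0)
#define smi_exit()	do { lockdep_on();  } while (0)

/* Stand-in for an annotated rcu_read_lock(): only report the
 * acquisition to the validator when tracking is enabled. */
void model_rcu_read_lock(void)
{
	if (!lockdep_recursion)
		tracked_acquires++;	/* stands in for lock_acquire() */
}
```

Using a counter rather than a flag keeps nested enter/exit pairs balanced, which matters if one such handler can interrupt another.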

Re: [PATCH] [5/12] x86_64: Make patching more robust, fix paravirt issue

2007-08-18 Thread Jeremy Fitzhardinge

Andi Kleen wrote:
>> This patch breaks Xen booting.  
>> 
>
> Check the latest git head. Does it still break?

Yes, that's with latest git head.

J

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Paul E. McKenney

On Fri, Aug 17, 2007 at 06:24:15PM -0700, Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
> 
> > On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> > > On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> > > >
> > > > gcc bugzilla bug #33102, for whatever that ends up being worth.  ;-)
> > > 
> > > I had totally forgotten that I'd already filed that bug more
> > > than six years ago until they just closed yours as a duplicate
> > > of mine :)
> > > 
> > > Good luck in getting it fixed!
> > 
> > Well, just got done re-opening it for the third time.  And a local
> > gcc community member advised me not to give up too easily.  But I
> > must admit that I am impressed with the speed that it was identified
> > as duplicate.
> > 
> > Should be entertaining!  ;-)
> 
> Right. ROTFL... volatile actually breaks atomic_t instead of making it 
> safe. x++ becomes a register load, increment and a register store. Without 
> volatile we can increment the memory directly. It seems that volatile 
> requires that the variable is loaded into a register first and then 
> operated upon. Understandable when you think about volatile being used to 
> access memory mapped I/O registers where a RMW operation could be 
> problematic.
> 
> See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3506

Yep.  The initial reaction was in fact to close my bug as a duplicate
of 3506.  But I was not asking for atomicity, but rather for smaller
code to be generated, so I reopened it.

Thanx, Paul
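
To make the distinction in the quoted text concrete, here is a small userspace sketch. It assumes C11 <stdatomic.h>, which is not what the kernel's atomic_t uses, and the names vcount, acount, bump_volatile() and bump_atomic() are made up for this example.

```c
#include <stdatomic.h>

volatile int vcount;	/* volatile: every access is performed, but ++ is not atomic */
atomic_int acount;	/* C11 atomic: fetch_add is one indivisible read-modify-write */

/* Separate load, add and store as far as the language is concerned;
 * concurrent callers can lose updates. */
void bump_volatile(void)
{
	vcount++;
}

/* Atomic increment; returns the value held before the increment. */
int bump_atomic(void)
{
	return atomic_fetch_add(&acount, 1);
}
```

Compiling this with "gcc -O2 -S" shows how a particular compiler handles the volatile increment; the result varies by compiler and version.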

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Paul E. McKenney

On Fri, Aug 17, 2007 at 09:13:35PM -0700, Linus Torvalds wrote:
> 
> 
> On Sat, 18 Aug 2007, Satyam Sharma wrote:
> > 
> > No code does (or would do, or should do):
> > 
> > x.counter++;
> > 
> > on an "atomic_t x;" anyway.
> 
> That's just an example of a general problem.
> 
> No, you don't use "x.counter++". But you *do* use
> 
>   if (atomic_read() <= 1)
> 
> and loading into a register is stupid and pointless, when you could just 
> do it as a regular memory-operand to the cmp instruction.
> 
> And as far as the compiler is concerned, the problem is the 100% same: 
> combining operations with the volatile memop.
> 
> The fact is, a compiler that thinks that
> 
>   movl mem,reg
>   cmpl $val,reg
> 
> is any better than
> 
>   cmpl $val,mem
> 
> is just not a very good compiler. But when talking about "volatile", 
> that's exactly what you always get (and always have gotten - this is 
> not a regression, and I doubt gcc is alone in this).

One of the gcc guys claimed that he thought that the two-instruction
sequence would be faster on some x86 machines.  I pointed out that
there might be a concern about code size.  I chose not to point out
that people might also care about the other x86 machines.  ;-)

Thanx, Paul

Re: [PATCH] [5/12] x86_64: Make patching more robust, fix paravirt issue

2007-08-18 Thread Chris Wright

* Linus Torvalds ([EMAIL PROTECTED]) wrote:
> On Sat, 18 Aug 2007, Chris Wright wrote:
> > > 
> > > Check the latest git head. Does it still break?
> > 
> > Yeah, this is the latest git.  The broken commit is Rusty's patch which,
> > after Linus reverted the write-protected remap changes, is no longer
> > necessary.  AFAICT patching is writing garbage into the insn stream.
> > I suspect it's copying an uninitialized temp buffer.
> 
> Can you send me the revert patch that is verified to work?

Now that I understand the problem, I do have a very simple (slightly
overkill) fix for paravirt patching.  This can be cleaned up to avoid
the copies when they aren't needed, but that will take a little more
auditing of the various patchers.  If you still prefer a revert I've
got one handy, and we can re-visit this all post .23.

thanks,
-chris
--

Subject: [PATCH] x86: properly initialize temp insn buffer for paravirt patching
From: Chris Wright <[EMAIL PROTECTED]>

With commit ab144f5ec64c42218a555ec1dbde6b60cf2982d6 the patching code
now collects the complete new instruction stream into a temp buffer
before finally patching in the new insns.  In some cases the paravirt
patchers will choose to leave the patch site unpatched (length mismatch,
clobbers mismatch, etc).  This causes the new patching code to copy an
uninitialized temp buffer, i.e. garbage, to the callsite.  Simply make
sure to always initialize the buffer with the original instruction stream.
A better fix is to audit all the patchers and return proper length so that
apply_paravirt() can skip copies when we leave the patch site untouched.

Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/i386/kernel/alternative.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/alternative.c b/arch/i386/kernel/alternative.c
index 1b66d5c..9f4ac8b 100644
--- a/arch/i386/kernel/alternative.c
+++ b/arch/i386/kernel/alternative.c
@@ -366,6 +366,8 @@ void apply_paravirt(struct paravirt_patch_site *start,
unsigned int used;

BUG_ON(p->len > MAX_PATCH_LEN);
+   /* prep the buffer with the original instructions */
+   memcpy(insnbuf, p->instr, p->len);
used = paravirt_ops.patch(p->instrtype, p->clobbers, insnbuf,
  (unsigned long)p->instr, p->len);
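
The failure mode described in the commit message can be reproduced in a few lines of userspace C. This is a model, not the kernel code: decline_patch() and apply_one() are hypothetical, the buffer is filled with 0xcc to stand in for stale stack contents (so the sketch itself never reads uninitialized memory), and the prefill flag switches the two-line fix on and off.

```c
#include <string.h>

#define MAX_PATCH_LEN 16

/* Hypothetical patcher that declines: it reports the full length but
 * never writes to insnbuf, like the "return len" paths described above. */
unsigned decline_patch(unsigned char *insnbuf, unsigned len)
{
	(void)insnbuf;
	return len;
}

/* Model of one loop iteration; assumes len <= MAX_PATCH_LEN. */
void apply_one(unsigned char *site, unsigned len, int prefill)
{
	unsigned char insnbuf[MAX_PATCH_LEN];
	unsigned used;

	/* Stale contents stand in for an uninitialized stack buffer. */
	memset(insnbuf, 0xcc, sizeof(insnbuf));

	if (prefill)
		memcpy(insnbuf, site, len);	/* prep with the original instructions */

	used = decline_patch(insnbuf, len);
	memcpy(site, insnbuf, used);		/* copies garbage unless pre-filled */
}
```

With prefill set, a declined site is rewritten with its own bytes and ends up unchanged; without it, the stale buffer contents land in the instruction stream.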


Re: power off disk drives while running

2007-08-18 Thread Brennan Ashton

On 8/18/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:
>
> On Aug 18 2007 14:22, Robert Hancock wrote:
> >> I see this as a very important feature in the embedded systems realm. I
> >> have worked on two projects that required extreme power management,
> >> and massive data storage.  The ability to fully turn off a drive while
> >> the system is running is key. It seems like this should be able to be
> >> done from a kernel point of view rather than extra hardware. Although
> >> if is not in the IDE/SATA spec then extra hardware would be necessary.
> >
> > You can put a drive into sleep mode with ATA commands, that one requires a
> > reset to take it out of that state (as opposed to standby which spins down but
> > will spin up on any command that's issued afterwards). That's as close as it
> > gets to fully powering off a drive through software.
>
> An IDE reset bringing the disk up again -- that does not sound like
> it is powered down. Power down for me means: as if the plug was pulled.
>
>
> Well, you could also rewrite the standby ioctl to do this:
>
>  - flush data
>  - send spindown request
>  - wait 1ms - 1s (give drive some time to park heads)
>  - outportb(0x378, 0) - poweroff by setting LPT data line to 0
>(who knows? they might control the disk power!)
>
> But you'd still have to fiddle with bringing it up again. That is, you have to
> patch the block or device driver to outportb(0x378, 255) again when something
> is supposed to spin up again.
>
> Oh and of course you have to deal with the problem that all userspace apps may
> hang because they are waiting for the disk.
>
> Also consider that frequently spinning up/down is said to reduce lifetime.
>
>
> Jan
> --
>
In my experience with embedded systems, the OS and all of the apps
usually reside in flash, and the hard disks are used as storage. One
example is remote, off-grid, high-resolution time-lapse photography:
the device itself is a simple OS, web server, and image grabber, all
residing in 32MB or so of flash. The HDs are used to store the images,
and are turned off completely at night and when the disk is full.

-- 
Brennan Ashton
Bellingham, Washington

"The box said, 'Requires Windows 98 or better'. So I installed Linux"

Re: power off disk drives while running

2007-08-18 Thread Jan Engelhardt

On Aug 18 2007 14:22, Robert Hancock wrote:
>> I see this as a very important feature in the embedded systems realm. I
>> have worked on two projects that required extreme power management,
>> and massive data storage.  The ability to fully turn off a drive while
>> the system is running is key. It seems like this should be able to be
>> done from a kernel point of view rather than extra hardware. Although
>> if is not in the IDE/SATA spec then extra hardware would be necessary.
>
> You can put a drive into sleep mode with ATA commands, that one requires a
> reset to take it out of that state (as opposed to standby which spins down but
> will spin up on any command that's issued afterwards). That's as close as it
> gets to fully powering off a drive through software.

An IDE reset bringing the disk up again -- that does not sound like
it is powered down. Power down for me means: as if the plug was pulled.

Well, you could also rewrite the standby ioctl to do this:

 - flush data
 - send spindown request
 - wait 1ms - 1s (give drive some time to park heads)
 - outportb(0x378, 0) - poweroff by setting LPT data line to 0
   (who knows? they might control the disk power!)

But you'd still have to fiddle with bringing it up again. That is, you have to
patch the block or device driver to outportb(0x378, 255) again when something
is supposed to spin up again.

Oh and of course you have to deal with the problem that all userspace apps may
hang because they are waiting for the disk.

Also consider that frequently spinning up/down is said to reduce lifetime.

Jan
-- 

Re: how to add debug information into the vmlinux

2007-08-18 Thread Jan Engelhardt


On Aug 18 2007 22:01, Xu Yang wrote:
>
>this vmlinux file is running on my software virtual prototype system.
>and my software environment can only load elf file, so I am using this
>real vmlinux file.

Maybe there is a problem in your virtual prototype system (VM?).


Jan
-- 

Re: [PATCH] [5/12] x86_64: Make patching more robust, fix paravirt issue

2007-08-18 Thread Linus Torvalds



On Sat, 18 Aug 2007, Chris Wright wrote:
> > 
> > Check the latest git head. Does it still break?
> 
> Yeah, this is the latest git.  The broken commit is Rusty's patch which,
> after Linus reverted the write-protected remap changes, is no longer
> necessary.  AFAICT patching is writing garbage into the insn stream.
> I suspect it's copying an uninitialized temp buffer.

Can you send me the revert patch that is verified to work?

Linus

Re: [patch 11/23] Fix m32r __xchg

2007-08-18 Thread Adrian Bunk

On Sun, Aug 12, 2007 at 10:54:45AM -0400, Mathieu Desnoyers wrote:
> the #endif  /* CONFIG_SMP */ should cover the default condition, or it may cause
> bad parameter to be silently missed.
> 
> Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
> CC: [EMAIL PROTECTED]
> CC: [EMAIL PROTECTED]
> ---
>  include/asm-m32r/system.h |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6-lttng/include/asm-m32r/system.h
> ===
> --- linux-2.6-lttng.orig/include/asm-m32r/system.h	2007-08-07 14:55:02.0 -0400
> +++ linux-2.6-lttng/include/asm-m32r/system.h	2007-08-07 14:57:57.0 -0400
> @@ -189,9 +189,9 @@ __xchg(unsigned long x, volatile void * 
>  #endif   /* CONFIG_CHIP_M32700_TS1 */
>   );
>   break;
> +#endif  /* CONFIG_SMP */
>   default:
>   __xchg_called_with_bad_pointer();
> -#endif  /* CONFIG_SMP */
>   }
>  
>   local_irq_restore(flags);

It seems you never checked whether your patch compiles:

<--  snip  -->

...
  CC  init/main.o
In file included from include2/asm/bitops.h:16,
                 from /home/bunk/linux/kernel-2.6/linux-2.6.23-rc2-mm2/include/linux/bitops.h:9,
                 from /home/bunk/linux/kernel-2.6/linux-2.6.23-rc2-mm2/include/linux/kernel.h:15,
                 from include2/asm/processor.h:16,
                 from /home/bunk/linux/kernel-2.6/linux-2.6.23-rc2-mm2/include/linux/prefetch.h:14,
                 from /home/bunk/linux/kernel-2.6/linux-2.6.23-rc2-mm2/include/linux/list.h:8,
                 from /home/bunk/linux/kernel-2.6/linux-2.6.23-rc2-mm2/include/linux/module.h:9,
                 from /home/bunk/linux/kernel-2.6/linux-2.6.23-rc2-mm2/init/main.c:13:
include2/asm/system.h: In function '__xchg':
include2/asm/system.h:191: error: implicit declaration of function '__xchg_called_with_bad_pointer'
make[2]: *** [init/main.o] Error 1

<--  snip  -->

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [PATCH] [5/12] x86_64: Make patching more robust, fix paravirt issue

2007-08-18 Thread Chris Wright

* Andi Kleen ([EMAIL PROTECTED]) wrote:
> > This patch breaks Xen booting.  
> 
> Check the latest git head. Does it still break?

Yeah, this is the latest git.  The broken commit is Rusty's patch which,
after Linus reverted the write-protected remap changes, is no longer
necessary.  AFAICT patching is writing garbage into the insn stream.
I suspect it's copying an uninitialized temp buffer.

thanks,
-chris

Re: power off disk drives while running

2007-08-18 Thread Robert Hancock


Brennan Ashton wrote:
> On 8/18/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:
>> On Aug 18 2007 12:08, Marty Leisner wrote:
>>> In embedded system design, it may be useful to poweroff the disks (as opposed
>>> to merely spinning them down).  We want to leave the system running while
>>> the disk is powered down, and let the disk powerup when it needs to be
>>> spun up.
>>
>> That means you also have to power it on...
>>
>>> While the "power off mechanism" would be platform dependent, is there a
>>> generic path to announce "prepare for power going away"?
>>
>> I do not see why that would be needed from a software point of view. Just make
>> sure that the disk does not needlessly emergency-park when pulling power. When
>> someone wants to write to disk, the request goes to the device driver, which
>> hands it to the controller, which hands it to the disk. And your controller
>> should be able to handle it (e.g. wait until reconnect) when there is a request
>> for a disk that is powered off.
>>
>>
>> Jan
>> --
>
> I see this as a very important feature in the embedded systems realm. I
> have worked on two projects that required extreme power management,
> and massive data storage.  The ability to fully turn off a drive while
> the system is running is key. It seems like this should be able to be
> done from a kernel point of view rather than extra hardware. Although
> if is not in the IDE/SATA spec then extra hardware would be necessary.


You can put a drive into sleep mode with ATA commands, that one requires 
a reset to take it out of that state (as opposed to standby which spins 
down but will spin up on any command that's issued afterwards). That's 
as close as it gets to fully powering off a drive through software.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
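
For reference, the standby and sleep commands mentioned above can be issued from userspace on Linux through the HDIO_DRIVE_CMD ioctl, which is what tools like hdparm do. The following is a minimal sketch, not a tested utility: the DEMO_* macro names and both function names are hypothetical, the opcode values 0xE0 (STANDBY IMMEDIATE) and 0xE6 (SLEEP) are the ATA command codes, and running it against a real disk needs root and will spin the drive down.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>	/* HDIO_DRIVE_CMD */

/* ATA command opcodes; macro names are local to this sketch. */
#define DEMO_ATA_STANDBY_IMMEDIATE	0xE0	/* spins down, wakes on next command */
#define DEMO_ATA_SLEEP			0xE6	/* needs a reset to wake up */

/* Fill the 4-byte argument block expected by HDIO_DRIVE_CMD. */
void build_power_cmd(unsigned char args[4], int deep_sleep)
{
	args[0] = deep_sleep ? DEMO_ATA_SLEEP : DEMO_ATA_STANDBY_IMMEDIATE;
	args[1] = 0;
	args[2] = 0;
	args[3] = 0;
}

/* Returns 0 on success, -1 on failure. */
int drive_power_down(const char *dev, int deep_sleep)
{
	unsigned char args[4];
	int fd, rc;

	fd = open(dev, O_RDONLY | O_NONBLOCK);
	if (fd < 0)
		return -1;

	build_power_cmd(args, deep_sleep);
	rc = ioctl(fd, HDIO_DRIVE_CMD, args);
	close(fd);
	return rc < 0 ? -1 : 0;
}
```

A call such as drive_power_down("/dev/sdb", 0) would request standby. Neither command cuts power to the drive electronics, which is the limitation this thread is about.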


[no subject]

2007-08-18 Thread Conke Hu

subscribe linux-kernel

Re: how to add debug information into the vmlinux

2007-08-18 Thread Xu Yang

This vmlinux file is running on my software virtual prototype system,
and my software environment can only load ELF files, so I am using this
real vmlinux file.

regards,

Yang

2007/8/18, Jan Engelhardt <[EMAIL PROTECTED]>:
>
> On Aug 18 2007 21:49, Xu Yang wrote:
> >
> >I tried what you told me, and the vmlinux does contain debug
> >information. But the start address of this vmlinux is 0xc0008000. When
> >I tried to run this vmlinux, the program always exited at 0x80a0. I
> >checked, and this is the place where the MMU is turned on.
> >So I used objcopy --change-addresses 0x4000 to change the start
> >address of the vmlinux, and the program goes further, but the debug
> >information disappeared.
>
> Are you using an UML image... or were you really trying to gdb a real vmlinux?
>
>
>
>Jan
> --
>

Re: how to add debug information into the vmlinux

2007-08-18 Thread Jan Engelhardt


On Aug 18 2007 21:49, Xu Yang wrote:
>
>I tried what you told me, and the vmlinux does contain debug
>information. But the start address of this vmlinux is 0xc0008000. When
>I tried to run this vmlinux, the program always exited at 0x80a0. I
>checked, and this is the place where the MMU is turned on.
>So I used objcopy --change-addresses 0x4000 to change the start
>address of the vmlinux, and the program goes further, but the debug
>information disappeared.

Are you using an UML image... or were you really trying to gdb a real vmlinux?



Jan
-- 

Re: how to add debug information into the vmlinux

2007-08-18 Thread Xu Yang

Hi Jesper,

I tried what you told me, and the vmlinux does contain debug
information. But the start address of this vmlinux is 0xc0008000. When
I tried to run this vmlinux, the program always exited at 0x80a0. I
checked, and this is the place where the MMU is turned on.
So I used objcopy --change-addresses 0x4000 to change the start
address of the vmlinux, and the program goes further, but the debug
information disappeared.

How should I handle this? (I want the program to run and the debug information to stay available.)

thanks,

regards,

yang

2007/8/17, Jesper Juhl <[EMAIL PROTECTED]>:
> On 17/08/07, Xu Yang <[EMAIL PROTECTED]> wrote:
> > Hello everyone,
> >
> > I am trying to port kernel 2.6.19 onto my system.so I need the c code
> > , which can show me where the program is running. I add -g when I
> > compile it.
> >
> You shouldn't need to do that manually, simply go into "make
> menuconfig", enter the "Kernel hacking" menu and select the "Kernel
> debugging" and "Compile the kernel with debug info" options.
> You may also want to enable "Compile the kernel with frame pointers"
> and various other options in that menu to get more debug info.
>
>
> --
> Jesper Juhl <[EMAIL PROTECTED]>
> Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
> Plain text mails only, please  http://www.expita.com/nomime.html
>

linux-image-2.6.22-1-amd64: Ethernet not functioning on Nvidia MCP51

2007-08-18 Thread Philippe Bourcier

Package: linux-image-2.6.22-1-amd64
Version: 2.6.22-3
Severity: grave
Justification: renders package unusable


hi all,

  I encountered the same problem as
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg182088.html.

/var/log/dmesg.2.gz:
- = - = - = - = - = - = - = - = - = - = - = - = - = - = - = - = - = -
Linux version 2.6.22-1-amd64 (Debian 2.6.22-3) ([EMAIL PROTECTED]) (gcc version 4.1.3 20070718 (prerelease) (Debian 4.1.2-14)) #1 SMP Sun Jul 29 13:54:41 UTC 2007
Command line: root=/dev/sdb2 ro video=nvidiafb:ywrap,mtrr vga=791
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000e - 0010 (reserved)
 BIOS-e820: 0010 - 77fc (usable)
 BIOS-e820: 77fc - 77fce000 (ACPI data)
 BIOS-e820: 77fce000 - 77ff (ACPI NVS)
 BIOS-e820: 77ff - 7800 (reserved)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: fee0 - fef0 (reserved)
 BIOS-e820: ff78 - 0001 (reserved)
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 491456) 1 entries of 3200 used
end_pfn_map = 1048576
DMI 2.4 present.
ACPI: RSDP 000FB5A0, 0014 (r0 ACPIAM)
ACPI: RSDT 77FC, 0030 (r1 A M I  OEMRSDT   8000630 MSFT   97)
ACPI: FACP 77FC0200, 0084 (r2 A M I  OEMFACP   8000630 MSFT   97)
ACPI: DSDT 77FC0440, 6087 (r1  A0588 A05880000 INTL 20060113)
ACPI: FACS 77FCE000, 0040
ACPI: MCFG 77FC0400, 003C (r1 A M I  OEMMCFG   8000630 MSFT   97)
ACPI: OEMB 77FCE040, 0060 (r1 A M I  AMI_OEM   8000630 MSFT   97)
Scanning NUMA topology in Northbridge 24
No NUMA configuration found
Faking a node at -77fc
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 491456) 1 entries of 3200 used
Bootmem setup node 0 -77fc
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1048576
early_node_map[2] active PFN ranges
    0:        0 ->      159
    0:      256 ->   491456
On node 0 totalpages: 491359
  DMA zone: 56 pages used for memmap
  DMA zone: 1020 pages reserved
  DMA zone: 2923 pages, LIFO batch:0
  DMA32 zone: 6663 pages used for memmap
  DMA32 zone: 480697 pages, LIFO batch:31
  Normal zone: 0 pages used for memmap
Nvidia board detected. Ignoring ACPI timer override.
If you got timer trouble try acpi_use_timer_override
ACPI: PM-Timer IO Port: 0x508
Intel MultiProcessor Specification v1.4
MPTABLE: OEM ID: ASUS MPTABLE: Product ID:  MPTABLE: APIC at: 0xFEE0
Processor #0 (Bootup-CPU)
Processor #1
I/O APIC #2 at 0xFEC0.
Setting APIC routing to flat
Processors: 2
swsusp: Registered nosave memory region: 0009f000 - 000a
swsusp: Registered nosave memory region: 000a - 000e
swsusp: Registered nosave memory region: 000e - 0010
Allocating PCI resources starting at 8000 (gap: 7800:86c0)
SMP: Allowing 2 CPUs, 0 hotplug CPUs
PERCPU: Allocating 37896 bytes of per cpu data
Built 1 zonelists.  Total pages: 483620
Kernel command line: root=/dev/sdb2 ro video=nvidiafb:ywrap,mtrr vga=791
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Marking TSC unstable due to TSCs unsynchronized
time.c: Detected 2210.106 MHz processor.
Console: colour dummy device 80x25
Checking aperture...
CPU 0: aperture @ c27000 size 32 MB
Aperture too small (32 MB)
No AGP bridge found
Memory: 1928104k/1965824k available (2009k kernel code, 37332k reserved, 946k 
data, 296k init)
Calibrating delay using timer specific routine.. 4424.04 BogoMIPS (lpj=8848081)
Security Framework v1.0.0 initialized
SELinux:  Disabled at boot.
Capability LSM initialized
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0/0 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
SMP alternatives: switching to UP code
ACPI: Core revision 20070126
ACPI: setting ELCR to 0200 (from 8ca0)
ExtINT not setup in hardware but reported by MP table
Using local APIC timer interrupts.
result 12557423
Detected 12.557 MHz APIC timer.
SMP alternatives: switching to SMP code
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4420.47 BogoMIPS (lpj=8840954)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 1/1 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ stepping 01
Brought up 2 CPUs
migration_cost=216
NET: Registered

Re: encrypted hibernation (was Re: Hibernation considerations)

2007-08-18 Thread Dr. David Alan Gilbert

* Rafael J. Wysocki ([EMAIL PROTECTED]) wrote:
> On Sunday, 12 August 2007 01:43, Dr. David Alan Gilbert wrote:
> > * Pavel Machek ([EMAIL PROTECTED]) wrote:
> > > Hi!
> > > 
> > > > > > Two things which I think would be nice to consider are:
> > > > > >1) Encryption - I'd actually prefer if my luks device did not
> > > > > > >remember the key across a hibernation;
> 
> Why exactly (assuming that the hibernation image is encrypted)?

I was assuming the hibernation image was not encrypted.
Certainly if it meant a penalty during normal operation (e.g. encrypted swap)
it wouldn't be.

(I have a small amount of encrypted data in a luks partition,
most of the time it isn't used, only rarely do apps have it open
and I'm not actually worried about crawling through swap to find out
what is there - this is just a personal laptop; I appreciate these
concerns are different depending what you are storing).

Dave
-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert| Running GNU/Linux on Alpha,68K| Happy  \ 
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _|_ http://www.treblig.org   |___/

Re: [linux-usb-devel] why was MODALIAS removed from usb kernel events? [u]

2007-08-18 Thread Andreas Jellinghaus [c]

On Friday, 17 August 2007, you wrote:
> On Fri, 17 Aug 2007, Andreas Jellinghaus [c] wrote:
> > I need some kernel event that has both DEVICE and MODALIAS set.
> > up to and including kernel 2.6.21 this seems to come from
> > drivers/usb/core/driver.c if I read the code correctly, and then
> > it was removed.
> >
> > udevmonitor --kernel --environment shows one event with both on 2.6.21
> > plain, but not on 2.6.22 plain.
>
> Does this patch improve matters?

Yes, thanks, it does. Can you add PRODUCT as well?
I remembered too late that my scripts use it too.
That one is not a big issue, though: MODALIAS carries the same
information, so I could use that instead. But other people might rely on it.

Regards, Andreas

Re: power off disk drives while running

2007-08-18 Thread Brennan Ashton

On 8/18/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:
>
> On Aug 18 2007 12:08, Marty Leisner wrote:
> >
> >In embedded system design, it may be useful to poweroff the disks (as opposed
> >to merely spinning them down).  We want to leave the system running while
> >the disk is powered down, and let the disk powerup when it needs to be
> >spun up.
>
> That means you also have to power it on...
>
> >While the "power off mechanism" would be platform dependent, is there a
> >generic path to announce "prepare for power going away"?
>
> I do not see why that would be needed from a software point of view. Just make
> sure that the disk does not needlessly emergency-park when pulling power. When
> someone wants to write to disk, the request goes to the device driver, which
> hands it to the controller, which hands it to the disk. And your controller
> should be able to handle it (e.g. wait until reconnect) when there is a request
> for a disk that is powered off.
>
>
> Jan
> --
>

I see this as a very important feature in the embedded systems realm. I
have worked on two projects that required extreme power management
and massive data storage.  The ability to fully turn off a drive while
the system is running is key. It seems like this should be doable
from the kernel rather than with extra hardware, although
if it is not in the IDE/SATA spec then extra hardware would be necessary.
-- 
Brennan Ashton
Bellingham, Washington

"The box said, 'Requires Windows 98 or better'. So I installed Linux"

Re: [PATCH 02/12] Blackfin arch: Add label to call new GPIO API

2007-08-18 Thread Robin Getz

On Fri 17 Aug 2007 18:34, David Brownell pondered:
> On Friday 17 August 2007, Robin Getz wrote:
> > On Fri 17 Aug 2007 14:24, David Brownell pondered:
> > > Just for the record, this is an unusual way to use these calls.
> > 
> > That is part of the natural evolution of the kernel isn't it - per
> > James's keynote at OLS - you release something, and see how people 
> > [ab]use it until it either grows, evolves, or it dies.
> 
> Yep ... and it's worth knowing when you're doing
> something different.  Different isn't always worse,
> isn't always better.

No disagreements here.

> > > Other platforms completely decouple these issues from the
> > > IRQ infrastructure ... doing the pinmux and gpio claiming
> > > separately from the request_irq()/free_irq() paths, mostly
> > > as part of board setup.  Doing all of that "early":
> > 
> > is early:
> >  - early in the kernel?
> >  - early before the kernel? (in the bootloader).
> 
> Both of those are "earlier", yes.  Different product developers
> may argue for either placement.

Just like we say things are better/easier for us later.

> > >  - keeps those error returns from causing hard-to-track-down
> > >runtime bugs;
> > 
> > The current Blackfin implementation causes a run time message:
> > "the pin  driver requested, was already claimed by yyy driver".
> > 
> > I don't think that is too bad?
> 
> Given some product with a Blackfin chip, would you expect a
> customer -- who may not even see the Linux bits!! -- to be
> able to solve such problems?  If it's not possible for such
> problems to crop up in the field, product support (and field
> troubleshooting) gets easier...

Typically customers who are not familiar with the linux bits are not doing 
modprobe either...

I don't see how early/late makes the problem easier/worse to debug. No matter 
when you do it - the driver refuses to install (or at least should).

> > >  - works always, even on platforms where a given IRQ may
> > >appear on any of several pins/balls;
> > 
> > But requires custom bootloaders or board setup for every hardware
> > platform?
> 
> One or both, yes.  That's typical in embedded setups.
> They're not necessarily all that different, but that
> code does need to handle the hardware differences.

Right - for us - the code handling the hardware differences is easier in the 
drivers, rather than the bootloaders.

For other systems - where you can have a UART on any pin - I completely 
understand your point.

> > Most of our users would not like that, since they do as you say - use
> > the same kernel - with different drivers on multiple platforms.
> 
> I thought I referred to different revisions of one platform... :)

You did - I was just saying that some of our customers don't do it the way you 
were thinking.

> > >  - makes it easier to cross-check against board schematics,
> > >by keeping most board-specific setup in one source file;
> > 
> > Yes - but we are not talking about muxing a common peripheral (like a
> > single UART) out many different pins (A or B or C). The UART pins are
> > fixed. If you want the UART, you need to use pin A. If you want to use
> > the I2C that also sits on pin A, you will get the message:
> > "pin A, requested by I2C, was already claimed by UART driver".
> 
> Not all platforms work that way though.  There can often be several
> options for where a given signal gets routed.

But this is how it _always_ works on Blackfin. For other systems - like ARM, 
where n+1 silicon manufacturers are all implementing things differently - I 
can understand your comments.


> > >  - allows the label to be more descriptive ... describing
> > >exactly *which* IRQ, so that using the labels for better
> > >diagnostics actually gives better diagnostics.
> > 
> > I'm not sure what you mean?
> 
> The $SUBJECT patch uses the string "IRQ" in all cases.
> But "smc_irq" and "codec_irq" would be more informative
> as entries in a list of even just a handful of GPIOs.
> And with a few dozen, I'd find "IRQ" not at all helpful.

I agree - things can always be more descriptive.
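
To make the labelling point concrete, here is a minimal user-space sketch of a
claim table whose conflict message names both the requester and the current
owner. It is not the Blackfin or generic GPIO code: claim_pin(), pin_owner and
NPINS are hypothetical names chosen for illustration. It only shows why a label
such as "smc_irq" is more useful in that message than a generic "IRQ".

```c
#include <stddef.h>
#include <stdio.h>

#define NPINS 16	/* hypothetical number of pins on the part */

/* Owner label per pin; NULL means the pin is still free. */
static const char *pin_owner[NPINS];

/*
 * Hypothetical claim helper: record `label` as the owner of `pin`.
 * On conflict, write a diagnostic naming both parties into `msg`
 * and return -1; return 0 on success.
 */
static int claim_pin(unsigned pin, const char *label, char *msg, size_t len)
{
	if (pin >= NPINS) {
		snprintf(msg, len, "pin %u out of range", pin);
		return -1;
	}
	if (pin_owner[pin] != NULL) {
		snprintf(msg, len,
			 "pin %u, requested by %s, was already claimed by %s",
			 pin, label, pin_owner[pin]);
		return -1;
	}
	pin_owner[pin] = label;
	return 0;
}
```

With labels like "uart0_rx" and "smc_irq" the resulting message identifies the
two drivers directly; if every interrupt pin were labelled "IRQ", the same
message would say very little.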

> > > Again, not "wrong"; but probably sub-optimal.  You might
> > > want to move towards earlier binding now, while Linux is
> > > still young on Blackfin and you don't have legacy code to
> > > worry about.
> > 
> > Our overall goal is to keep as much code - including bootloader -
> > platform  agnostic, and not require people to write any of
> > code/configuration data to boot up something, and get things
> > working in a semi-standard manner. 
> 
> The issue is just where those limits lie.  IMO it's not at
> all unreasonable to require board-specific code.  External
> chips will need board-specific glue data in most cases (how
> they're addressed, what IRQs they use, and so on); and you
> may have drivers available that correspond to devices that
> are not wired up on that particular hardware.
> 
> 
> > This still has its limits - which is why we publish all our hardware
> > designs.

Re: + proc-export-a-processes-resource-limits-via-proc-pid.patch added to -mm tree

2007-08-18 Thread Oleg Nesterov

On 08/18, Neil Horman wrote:
>
> +static int proc_pid_limits(struct task_struct *task, char *buffer)
> +{
> + unsigned int i;
> + int count = 0;
> + unsigned long flags;
> + char *bufptr = buffer;
> +
> + struct rlimit rlim[RLIM_NLIMITS];
> +
> + read_lock(&tasklist_lock);
> + lock_task_sighand(task, &flags);
> + if (task->signal == NULL){
> +         unlock_task_sighand(task, &flags);
> +         read_unlock(&tasklist_lock);

Neil, please, don't add tasklist_lock again. It was not easy to wipe it from
fs/proc/ :) Just change this code to use rcu_read_lock().

Oleg.


Re: [PATCH] [1/4] x86_64: Fail dma_alloc_coherent on dma less devices

2007-08-18 Thread Andi Kleen

On Saturday 18 August 2007 19:22:18 Linus Torvalds wrote:
> 
> Hmm. I think this is wrong. 
> 
> Why? Because the regular 32-bit x86 code does this all completely 
> differently, and doesn't use dma_mask at all. Instead, it _only_ uses 
> dev->coherent_dma_mask (which, considering the name of the function, 
> would seem to make sense).

Yes, see my discussion with Alan. Likely this needs to be handled
in the caller to really fix the problem (pata_pcmcia oopsing when
it passes a DMA incapable pcmcia device to dma_alloc_coherent) 

Another possible fix proposed by James was to give the pcmcia
devices a dma_mask of 0 which would also work.

> Considering that the oops comes from this:
> 
> /* Kludge to make it bug-to-bug compatible with i386. i386
>uses the normal dma_mask for alloc_coherent. */
> dma_mask &= *dev->dma_mask;

> 
> and that that code is *old*, and comes from when this file was called 
> arch/x86_64/kernel/pci-gart.c, and the comment doesn't seem to even be 

It might be outdated or it might not. The kludge was needed for ALSA because old
i386 ignored the consistent mask and they didn't always set it correctly, but
that should be obsolete now? I'm not quite sure because sound devices
are not always well tested on large memory systems, which are the only
ones that show this problem. Takashi, do you know if all ALSA drivers
set the consistent mask correctly now?

But I didn't want to touch that so late -- can do it for .24.

Still failing is probably correct in this case so I included the patch.
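
A minimal sketch of the "fail instead of oops" idea, assuming the oops is a
NULL dev->dma_mask being dereferenced by the quoted `dma_mask &= *dev->dma_mask;`
line. This is not the patch itself: struct fake_device and
coherent_alloc_mask() are hypothetical stand-ins so the check can be shown in
isolation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two mask fields discussed above. */
struct fake_device {
	uint64_t *dma_mask;		/* NULL for a device that cannot do DMA */
	uint64_t coherent_dma_mask;
};

/*
 * Compute the mask a coherent allocation would have to satisfy.
 * Returns 0 and stores the mask on success, or -1 if the device has
 * no dma_mask at all, so the caller can fail the allocation rather
 * than dereference a NULL pointer.
 */
static int coherent_alloc_mask(const struct fake_device *dev, uint64_t *mask)
{
	if (dev->dma_mask == NULL)
		return -1;

	*mask = dev->coherent_dma_mask & *dev->dma_mask;
	return 0;
}
```

Giving such devices a dma_mask of 0, as mentioned above, would be the
alternative: the pointer is then valid and the resulting mask simply allows
nothing.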

-Andi

Re: + proc-export-a-processes-resource-limits-via-proc-pid.patch added to -mm tree

2007-08-18 Thread Neil Horman


Ok, I think I see your point.  Thanks for the input.  New patch attached which
adds the use of the sighand lock to the patch.


Currently, there exists no method for a process to query the resource
limits of another process.  They can be inferred via some mechanisms but
they cannot be explicitly determined.  Given that this information can be
useful to know during the debugging of an application, I've written this
patch which exports all of a process's limits via /proc/<pid>/limits.
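
For a feel of the resulting table, here is a small user-space sketch that
formats one row with the same printf field widths the patch uses. It is an
illustration only, not kernel code: format_limit() is a hypothetical helper,
and it takes the values as arguments instead of reading task->signal->rlim.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Format one limit as a table row: name, soft limit, hard limit and an
 * optional unit (NULL means the limit has no unit).
 * Returns the number of characters written, as snprintf() does.
 */
static int format_limit(char *buf, size_t len, const char *name,
			rlim_t soft, rlim_t hard, const char *unit)
{
	char s[32], h[32];

	/* RLIM_INFINITY is printed as the word "unlimited". */
	if (soft == RLIM_INFINITY)
		snprintf(s, sizeof(s), "unlimited");
	else
		snprintf(s, sizeof(s), "%llu", (unsigned long long)soft);

	if (hard == RLIM_INFINITY)
		snprintf(h, sizeof(h), "unlimited");
	else
		snprintf(h, sizeof(h), "%llu", (unsigned long long)hard);

	if (unit)
		return snprintf(buf, len, "%-25s %-20s %-20s %-10s\n",
				name, s, h, unit);
	return snprintf(buf, len, "%-25s %-20s %-20s \n", name, s, h);
}
```

A caller could fill in soft and hard from getrlimit(RLIMIT_NOFILE, &rl) to
print its own limits in the same layout.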

Regards
Neil

Signed-off-by: Neil Horman <[EMAIL PROTECTED]>


 base.c |   77 +
 1 file changed, 77 insertions(+)



diff --git a/fs/proc/base.c b/fs/proc/base.c
index ed2b224..86130b0 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -74,6 +74,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 
 /* NOTE:
@@ -323,6 +324,80 @@ static int proc_oom_score(struct task_struct *task, char *buffer)
return sprintf(buffer, "%lu\n", points);
 }
 
+struct limit_names {
+   char *name;
+   char *unit;
+};
+
+static const struct limit_names lnames[RLIM_NLIMITS] = {
+   [RLIMIT_CPU] = {"Max cpu time", "ms"},
+   [RLIMIT_FSIZE] = {"Max file size", "bytes"},
+   [RLIMIT_DATA] = {"Max data size", "bytes"},
+   [RLIMIT_STACK] = {"Max stack size", "bytes"},
+   [RLIMIT_CORE] = {"Max core file size", "bytes"},
+   [RLIMIT_RSS] = {"Max resident set", "bytes"},
+   [RLIMIT_NPROC] = {"Max processes", "processes"},
+   [RLIMIT_NOFILE] = {"Max open files", "files"},
+   [RLIMIT_MEMLOCK] = {"Max locked memory", "bytes"},
+   [RLIMIT_AS] = {"Max address space", "bytes"},
+   [RLIMIT_LOCKS] = {"Max file locks", "locks"},
+   [RLIMIT_SIGPENDING] = {"Max pending signals", "signals"},
+   [RLIMIT_MSGQUEUE] = {"Max msgqueue size", "bytes"},
+   [RLIMIT_NICE] = {"Max nice priority", NULL},
+   [RLIMIT_RTPRIO] = {"Max realtime priority", NULL},
+};
+
+/* Display limits for a process */
+static int proc_pid_limits(struct task_struct *task, char *buffer)
+{
+   unsigned int i;
+   int count = 0;
+   unsigned long flags;
+   char *bufptr = buffer;
+
+   struct rlimit rlim[RLIM_NLIMITS];
+
+   read_lock(&tasklist_lock);
+   lock_task_sighand(task, &flags);
+   if (task->signal == NULL){
+           unlock_task_sighand(task, &flags);
+           read_unlock(&tasklist_lock);
+           return 0;
+   }
+   memcpy(rlim, task->signal->rlim, sizeof(struct rlimit) * RLIM_NLIMITS);
+   unlock_task_sighand(task, &flags);
+   read_unlock(&tasklist_lock);
+
+   /*
+    * print the file header
+    */
+   count += sprintf(&bufptr[count], "%-25s %-20s %-20s %-10s\n",
+                    "Limit", "Soft Limit", "Hard Limit", "Units");
+
+   for (i = 0; i < RLIM_NLIMITS; i++) {
+           if (rlim[i].rlim_cur == RLIM_INFINITY)
+                   count += sprintf(&bufptr[count], "%-25s %-20s ",
+                                    lnames[i].name, "unlimited");
+           else
+                   count += sprintf(&bufptr[count], "%-25s %-20lu ",
+                                    lnames[i].name, rlim[i].rlim_cur);
+
+           if (rlim[i].rlim_max == RLIM_INFINITY)
+                   count += sprintf(&bufptr[count], "%-20s ", "unlimited");
+           else
+                   count += sprintf(&bufptr[count], "%-20lu ",
+                                    rlim[i].rlim_max);
+
+           if (lnames[i].unit)
+                   count += sprintf(&bufptr[count], "%-10s\n",
+                                    lnames[i].unit);
+           else
+                   count += sprintf(&bufptr[count], "\n");
+   }
+
+   return count;
+}
+
 //
 /*   Here the fs part begins*/
 //
@@ -2017,6 +2092,7 @@ static const struct pid_entry tgid_base_stuff[] = {
INF("environ",S_IRUSR, pid_environ),
INF("auxv",   S_IRUSR, pid_auxv),
INF("status", S_IRUGO, pid_status),
+   INF("limits", S_IRUSR, pid_limits),
 #ifdef CONFIG_SCHED_DEBUG
REG("sched",  S_IRUGO|S_IWUSR, pid_sched),
 #endif
@@ -2310,6 +2386,7 @@ static const struct pid_entry tid_base_stuff[] = {
INF("environ",   S_IRUSR, pid_environ),
INF("auxv",  S_IRUSR, pid_auxv),
INF("status",S_IRUGO, pid_status),
+   INF("limits",S_IRUSR, pid_limits),
 #ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, pid_sched),
 #endif

-- 
/***
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 [EMAIL PROTECTED]
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***/

Re: [PATCH] x86-64: memset optimization

2007-08-18 Thread Andi Kleen


> The problem is that on x86-64 you are overriding memset() 

I don't.  You must be looking at old source

asm-x86_64/string.h 2.6.23rc3:

#define __HAVE_ARCH_MEMSET
void *memset(void *s, int c, size_t n);

I wanted to do the same on i386 too, but there were some minor obstacles.
The problem is that the out of line fallback i386 memset is currently
quite dumb and needs to be rewritten to expand the fill char on its
own like the x86-64 version. Probably best would be just to port
the x86-64 version. I just hadn't had time for that.
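
For readers unfamiliar with the phrase, "expanding the fill char" means
replicating the fill byte into every byte of a machine word so the bulk of the
buffer can be written a word at a time. The following is a portable C sketch of
that idea under my own assumptions, not the kernel's assembly routine:
expand_fill() and memset_words() are hypothetical names, and the multiply by
0x0101010101010101 is just one common way to do the replication.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate the low byte of c into all eight bytes of a 64-bit word. */
static uint64_t expand_fill(int c)
{
	return (uint64_t)(unsigned char)c * 0x0101010101010101ULL;
}

/* Hypothetical memset variant: byte-wise head, word-wise body, byte-wise tail. */
static void *memset_words(void *s, int c, size_t n)
{
	unsigned char *p = s;
	uint64_t pattern = expand_fill(c);

	/* Head: fill single bytes until p is 8-byte aligned. */
	while (n > 0 && ((uintptr_t)p & 7) != 0) {
		*p++ = (unsigned char)c;
		n--;
	}

	/* Body: store the expanded pattern one word at a time. */
	while (n >= 8) {
		memcpy(p, &pattern, 8);	/* memcpy sidesteps aliasing rules */
		p += 8;
		n -= 8;
	}

	/* Tail: whatever is left, byte by byte. */
	while (n > 0) {
		*p++ = (unsigned char)c;
		n--;
	}
	return s;
}
```

A caller that only passes the raw byte gets the same result as with memset();
the expansion happens inside the routine, which is the point being made about
the i386 fallback.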

[Patches welcome, but if you do ask me for my old memset test harness]

-Andi

Re: Thinking outside the box on file systems

2007-08-18 Thread Al Viro

On Sat, Aug 18, 2007 at 09:45:54AM -0700, Marc Perkel wrote:

> Linux isn't going to make progress when people try to
> figure out how to make something NOT work rather than
> to make something work. So if you are going to put
> effort into this then why not try to figure out how to
> get around the issues you are raising rather than to
> attack the idea as unsolvable.

It's your idea; _you_ get to defend it against the problems found by
reviewers.  And whining about negativity is the wrong way to do that.
Look at it that way: there is science and there is feel-good woo.
The former depends on peer review.  The latter depends on not having
it and vague handwaving is the classical way of avoiding it.  So are
the claims of being a "visionary" and accusing critics of being uncooperative
reactionaries conspiring against the progress.

So far you are doing very poorly; if you want somebody else to join
you in experimenting with these ideas, you are acting in a very
inefficient way (and if you don't want anybody else, you'll obviously
have to deal with details yourself anyway).  Asserting that critics
should patch the holes in your handwaving is unlikely to impress anybody;
arrogance is not in short supply around here and yours is not even
original.

Re: [PATCH] include linux/types.h in if_fddi.h

2007-08-18 Thread Adrian Bunk

On Thu, Aug 09, 2007 at 05:36:39PM +0100, Maciej W. Rozycki wrote:
> On Thu, 9 Aug 2007, Olaf Hering wrote:
> 
> > include/linux/if_fddi.h is an exported header.
> > It uses __be16. Include linux/types.h to get this prototype.
> 
>  Please note that for userland it does not matter.  With glibc you should 
> include  which does the necessary bits before including 
> .  Any other C library should likely take a similar 
> approach.

It doesn't make any sense to have all libc's figure out inter-header 
dependencies.

For each userspace exported header file an #include  should 
always compile, and if it fails due to missing #include's in the header 
that's a bug that should be fixed.

>  It still seems right for Linux itself though.
> 
>   Maciej

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [PATCH 8/9] define global BIT macro

2007-08-18 Thread Randy Dunlap


Jiri Slaby wrote:
> Randy Dunlap wrote:
>> On Sat, 18 Aug 2007 11:44:12 +0200 (CEST) Jiri Slaby wrote:
>>
>>> define global BIT macro
>>>
>>> move all local BIT defines to the new globally defined macro.
>>>
>>> Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>
>>>
>>> ---
>>>
>>>  include/linux/bitops.h  |1 +
>>>  include/video/sstfb.h   |1 -
>>>  include/video/tdfx.h|2 --
>>>  net/mac80211/ieee80211_i.h  |2 --
>>>  18 files changed, 1 insertions(+), 37 deletions(-)
>>>
>>> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
>>> index 3255b06..a57b81f 100644
>>> --- a/include/linux/bitops.h
>>> +++ b/include/linux/bitops.h
>>> @@ -3,6 +3,7 @@
>>>  #include 
>>>  
>>>  #ifdef	__KERNEL__
>>> +#define BIT(nr)                (1UL << (nr))
>>>  #define BIT_MASK(nr)           (1UL << ((nr) % BITS_PER_LONG))
>>>  #define BIT_WORD(nr)           ((nr) / BITS_PER_LONG)
>>>  #define BITS_TO_TYPE(nr, t)    (((nr)+(t)-1)/(t))
>>
>> So users of the BIT() macro in include/linux/input.h can be
>> changed to use the global BIT_MASK() macro...
>> and the former can be removed.
>
> I'm afraid I don't understand you. Maybe, you are writing about changes done in
> patch no. 7 [1], which didn't go through to the lkml?
>
> [1]
> http://www.fi.muni.cz/~xslaby/sklad/07-get-rid-of-input-bit-duplicate-defines.patch

Exactly.  Thanks.
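
For reference, the difference between the three macros in the hunk above shows
up once a bit number can exceed the width of a long. The sketch below is a
user-space illustration, not kernel code: the macro bodies are copied from the
hunk, BITS_PER_LONG is defined locally as an assumption (the kernel gets it
from its own headers), and bitmap_set_bit()/bitmap_test_bit() are hypothetical
helpers.

```c
#include <limits.h>

/* Local stand-in for the kernel's BITS_PER_LONG (assumption). */
#define BITS_PER_LONG	((int)(sizeof(unsigned long) * CHAR_BIT))

#define BIT(nr)		(1UL << (nr))
#define BIT_MASK(nr)	(1UL << ((nr) % BITS_PER_LONG))
#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)

/* Set bit nr in a bitmap stored as an array of unsigned long. */
static void bitmap_set_bit(unsigned long *map, int nr)
{
	/* BIT_WORD picks the array element, BIT_MASK the bit inside it. */
	map[BIT_WORD(nr)] |= BIT_MASK(nr);
}

/* Return non-zero if bit nr is set in the bitmap. */
static int bitmap_test_bit(const unsigned long *map, int nr)
{
	return (map[BIT_WORD(nr)] & BIT_MASK(nr)) != 0;
}
```

BIT(nr) is only meaningful for nr below BITS_PER_LONG, whereas BIT_MASK(nr)
wraps the bit number so it can be paired with BIT_WORD(nr) for multi-word
bitmaps.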

--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

tracking MAINTAINERS versus tracking SUBSYSTEMS

2007-08-18 Thread Robert P. J. Day


  this latest project of cramming the full definition of each kernel
subsystem into the MAINTAINERS file has been bothering me, and i've
finally figured out why.  it's because the MAINTAINERS file is being
asked to now be the source of reference information that just doesn't
match its name.  it's the "MAINTAINERS" file so it seems that all it
should be keeping track of is the maintainer of each subsystem, and
that's all.

  what people are clearly after is a way to match any part of the
kernel source to a subsystem and, hence, to a maintainer, but
there's nothing that says all of that has to be crammed into *that*
file.

  why not add a new script to the kernel source tree that, given a
file or directory name as an argument, returns its corresponding
"subsystem" that can be cross-referenced against the MAINTAINERS file?
something like:

  $ show_subsystem drivers/bluetooth/bpa10x.c
  BLUETOOTH

there would seem to be a number of advantages to this approach:

1) it reduces the MAINTAINERS file back to what it should be in the
first place -- a simple reference list of each kernel subsystem, and
who's responsible for it, so that constant reshuffling of files or
directories in a particular subsystem doesn't require constant
updating of the MAINTAINERS file.

2) you could extend the show_subsystem() routine to, once it found the
subsystem, quickly cross-reference the MAINTAINERS file and print out
the corresponding maintainer.  i believe the word "trivial" applies
here.

3) by making this a feature separate from the MAINTAINERS file, it can
be mocked up and hacked separately and finally patched in when it's
ready to go, rather than applying 5 bazillion patches to the poor
MAINTAINERS file.

4) finally, a feature like this could be used as a sanity check on the
kernel subsystem structure.  every once in a while, it could be
invoked for every single file and directory in the tree, just to see
if all of those appear to belong to at least one subsystem.  if not,
print a warning:  "Whoa, file /fubar/snafu doesn't belong to a
subsystem.  Deal with it."

  the actual implementation would seem to be easy -- perhaps a simple
text file that defines each subsystem and every file and directory
that's part of it:

FIREWIRE:drivers/firewire/,include/linux/firewire.h,... etc ...

i mean, it doesn't get a whole lot simpler than that, and it would
seem to be *way* easier to read.
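
A minimal sketch of how such a lookup might work for one line of the proposed text file. Everything here is an assumption for illustration: the function names, the rule that a pattern ending in '/' covers a whole directory, and the rule that any other pattern must match the path exactly. It is not an existing kernel script.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: does one pattern cover `path`?
 * A pattern ending in '/' covers everything below that directory;
 * any other pattern must match the path exactly.
 */
static int pattern_covers(const char *pat, size_t len, const char *path)
{
	if (len == 0)
		return 0;
	if (pat[len - 1] == '/')
		return strncmp(path, pat, len) == 0;
	return strlen(path) == len && strncmp(path, pat, len) == 0;
}

/*
 * Hypothetical lookup for one "NAME:pattern,pattern,..." line.
 * On a match, copies NAME into `out` (truncated to fit) and returns 1.
 * Returns 0 if the line has no ':' or no pattern covers the path.
 */
int show_subsystem(const char *line, const char *path, char *out, size_t outlen)
{
	const char *colon = strchr(line, ':');
	const char *p;

	if (!colon || outlen == 0)
		return 0;

	for (p = colon + 1; *p; ) {
		size_t len = strcspn(p, ",\n");

		if (pattern_covers(p, len, path)) {
			size_t n = (size_t)(colon - line);

			if (n >= outlen)
				n = outlen - 1;
			memcpy(out, line, n);
			out[n] = '\0';
			return 1;
		}
		p += len;
		if (*p)		/* skip the ',' or '\n' separator */
			p++;
	}
	return 0;
}
```

A caller would read the subsystem file line by line and stop at the first line for which show_subsystem() returns 1; a path that matches no line is exactly the "doesn't belong to a subsystem" warning case from point 4.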

  thoughts?

rday

-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca


[PATCH 4/5] exec: consolidate 2 fast-paths

2007-08-18 Thread Oleg Nesterov

Now that we don't pre-allocate the new ->sighand, we can kill the first fast
path, it doesn't make sense any longer. At best, it can save one "list_empty()"
check but leads to the code duplication.

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- t/fs/exec.c~4_FASTPATH  2007-08-18 19:10:58.0 +0400
+++ t/fs/exec.c 2007-08-18 19:34:12.0 +0400
@@ -779,16 +779,6 @@ static int de_thread(struct task_struct 
struct task_struct *leader = NULL;
int count;
 
-   /*
-* If we don't share sighandlers, then we aren't sharing anything
-* and we can just re-use it all.
-*/
-   if (atomic_read(>count) <= 1) {
-   signalfd_detach(tsk);
-   exit_itimers(sig);
-   return 0;
-   }
-
if (thread_group_empty(tsk))
goto no_thread_group;
 


[RFC,PATCH 5/5] exec: RT sub-thread can livelock and monopolize CPU on exec

2007-08-18 Thread Oleg Nesterov

de_thread() yields waiting for ->group_leader to be a zombie. This deadlocks
if an rt-prio execer shares the same cpu with ->group_leader. Change the code
to use ->group_exit_task/notify_count mechanics.

This patch certainly uglifies the code, perhaps someone can suggest something
better.

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- t/kernel/exit.c~5_DEADLOCK  2007-08-17 18:56:31.0 +0400
+++ t/kernel/exit.c 2007-08-18 20:57:49.0 +0400
@@ -102,10 +102,9 @@ static void __exit_signal(struct task_st
 * If there is any task waiting for the group exit
 * then notify it:
 */
-   if (sig->group_exit_task && atomic_read(>count) == sig->notify_count) {
+   if (sig->group_exit_task && atomic_read(>count) == sig->notify_count)
wake_up_process(sig->group_exit_task);
-   sig->group_exit_task = NULL;
-   }
+
if (tsk == sig->curr_target)
sig->curr_target = next_thread(tsk);
/*
@@ -850,6 +849,11 @@ static void exit_notify(struct task_stru
state = EXIT_DEAD;
tsk->exit_state = state;
 
+   if (thread_group_leader(tsk) &&
+   tsk->signal->notify_count < 0 &&
+   tsk->signal->group_exit_task)
+   wake_up_process(tsk->signal->group_exit_task);
+
write_unlock_irq(_lock);
 
list_for_each_safe(_p, _n, _dead) {
--- t/fs/exec.c~5_DEADLOCK  2007-08-18 19:34:12.0 +0400
+++ t/fs/exec.c 2007-08-18 20:43:59.0 +0400
@@ -828,16 +828,15 @@ static int de_thread(struct task_struct 
hrtimer_restart(>real_timer);
spin_lock_irq(lock);
}
+
+   sig->notify_count = count;
+   sig->group_exit_task = tsk;
while (atomic_read(>count) > count) {
-   sig->group_exit_task = tsk;
-   sig->notify_count = count;
__set_current_state(TASK_UNINTERRUPTIBLE);
spin_unlock_irq(lock);
schedule();
spin_lock_irq(lock);
}
-   sig->group_exit_task = NULL;
-   sig->notify_count = 0;
spin_unlock_irq(lock);
 
/*
@@ -846,14 +845,17 @@ static int de_thread(struct task_struct 
 * and to assume its PID:
 */
if (!thread_group_leader(tsk)) {
-   /*
-* Wait for the thread group leader to be a zombie.
-* It should already be zombie at this point, most
-* of the time.
-*/
leader = tsk->group_leader;
-   while (leader->exit_state != EXIT_ZOMBIE)
-   yield();
+
+   sig->notify_count = -1;
+   for (;;) {
+   write_lock_irq(_lock);
+   if (likely(leader->exit_state))
+   break;
+   __set_current_state(TASK_UNINTERRUPTIBLE);
+   write_unlock_irq(_lock);
+   schedule();
+   }
 
/*
 * The only record we have of the real-time age of a
@@ -867,8 +869,6 @@ static int de_thread(struct task_struct 
 */
tsk->start_time = leader->start_time;
 
-   write_lock_irq(_lock);
-
BUG_ON(leader->tgid != tsk->tgid);
BUG_ON(tsk->pid == tsk->tgid);
/*
@@ -901,6 +901,8 @@ static int de_thread(struct task_struct 
write_unlock_irq(_lock);
 }
 
+   sig->group_exit_task = NULL;
+   sig->notify_count = 0;
/*
 * There may be one thread left which is just exiting,
 * but it's safe to stop telling the group to kill themselves.


[PATCH 3/5] exec: simplify the new ->sighand allocation

2007-08-18 Thread Oleg Nesterov

de_thread() pre-allocates newsighand to make sure that exec() can't fail after
killing all sub-threads. Imho, this buys nothing, but complicates the code:

- this is (mostly) needed to handle CLONE_SIGHAND without CLONE_THREAD
  tasks, this is very unlikely (if ever used) case

- unless we already have some serious problems, GFP_KERNEL allocation
  should not fail

- ENOMEM still can happen after de_thread(), ->sighand is not the last
  object we have to allocate

Change the code to allocate the new ->sighand on demand.

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- t/fs/exec.c~3_ALLOC 2007-08-18 18:26:48.0 +0400
+++ t/fs/exec.c 2007-08-18 19:10:58.0 +0400
@@ -774,7 +774,7 @@ static int exec_mmap(struct mm_struct *m
 static int de_thread(struct task_struct *tsk)
 {
struct signal_struct *sig = tsk->signal;
-   struct sighand_struct *newsighand, *oldsighand = tsk->sighand;
+   struct sighand_struct *oldsighand = tsk->sighand;
spinlock_t *lock = >siglock;
struct task_struct *leader = NULL;
int count;
@@ -789,10 +789,6 @@ static int de_thread(struct task_struct 
return 0;
}
 
-   newsighand = kmem_cache_alloc(sighand_cachep, GFP_KERNEL);
-   if (!newsighand)
-   return -ENOMEM;
-
if (thread_group_empty(tsk))
goto no_thread_group;
 
@@ -809,7 +805,6 @@ static int de_thread(struct task_struct 
 */
spin_unlock_irq(lock);
read_unlock(_lock);
-   kmem_cache_free(sighand_cachep, newsighand);
return -EAGAIN;
}
 
@@ -928,17 +923,16 @@ no_thread_group:
if (leader)
release_task(leader);
 
-   if (atomic_read(>count) == 1) {
-   /*
-* Now that we nuked the rest of the thread group,
-* it turns out we are not sharing sighand any more either.
-* So we can just keep it.
-*/
-   kmem_cache_free(sighand_cachep, newsighand);
-   } else {
+   if (atomic_read(>count) != 1) {
+   struct sighand_struct *newsighand;
/*
-* Move our state over to newsighand and switch it in.
+* This ->sighand is shared with the CLONE_SIGHAND
+* but not CLONE_THREAD task, switch to the new one.
 */
+   newsighand = kmem_cache_alloc(sighand_cachep, GFP_KERNEL);
+   if (!newsighand)
+   return -ENOMEM;
+
atomic_set(>count, 1);
memcpy(newsighand->action, oldsighand->action,
   sizeof(newsighand->action));


[PATCH 1/5] exec: kill unsafe BUG_ON(sig->count) checks

2007-08-18 Thread Oleg Nesterov

de_thread:

if (atomic_read(>count) <= 1)
BUG_ON(atomic_read(>count) != 1);

This is not safe without the rmb() in between. The results of two correctly
ordered __exit_signal()->atomic_dec_and_test()'s could be seen out of order
on our CPU.

The same is true for the "thread_group_empty()" case, __unhash_process()'s
changes could be seen before atomic_dec_and_test(>count).

On some platforms (including i386) atomic_read() doesn't provide even the
compiler barrier, in that case these checks are simply racy.

Remove these BUG_ON()'s. Alternatively, we can do something like

BUG_ON( ({ smp_rmb(); atomic_read(>count) != 1; }) );

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- t/fs/exec.c~1_BUG_ON2007-08-18 17:36:58.0 +0400
+++ t/fs/exec.c 2007-08-18 18:19:41.0 +0400
@@ -784,7 +784,6 @@ static int de_thread(struct task_struct 
 * and we can just re-use it all.
 */
if (atomic_read(>count) <= 1) {
-   BUG_ON(atomic_read(>count) != 1);
signalfd_detach(tsk);
exit_itimers(sig);
return 0;
@@ -929,8 +928,6 @@ no_thread_group:
if (leader)
release_task(leader);
 
-   BUG_ON(atomic_read(>count) != 1);
-
if (atomic_read(>count) == 1) {
/*
 * Now that we nuked the rest of the thread group,


[PATCH 2/5] exec: simplify ->sighand switching

2007-08-18 Thread Oleg Nesterov

There is no reason to do recalc_sigpending() after changing ->sighand.
To begin with, recalc_sigpending() does not take ->sighand into account.

This means we don't need to take newsighand->siglock while changing sighands.
rcu_assign_pointer() provides a necessary barrier, and if another process
reads the new ->sighand it should either take tasklist_lock or it should use
lock_task_sighand() which has a corresponding smp_read_barrier_depends().

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- t/fs/exec.c~2_SWITCH2007-08-18 18:19:41.0 +0400
+++ t/fs/exec.c 2007-08-18 18:26:48.0 +0400
@@ -945,12 +945,7 @@ no_thread_group:
 
write_lock_irq(_lock);
spin_lock(>siglock);
-   spin_lock_nested(>siglock, SINGLE_DEPTH_NESTING);
-
rcu_assign_pointer(tsk->sighand, newsighand);
-   recalc_sigpending();
-
-   spin_unlock(>siglock);
spin_unlock(>siglock);
write_unlock_irq(_lock);
 
@@ -960,12 +955,11 @@ no_thread_group:
BUG_ON(!thread_group_leader(tsk));
return 0;
 }
-   
+
 /*
  * These functions flushes out all traces of the currently running executable
  * so that a new one can be started
  */
-
 static void flush_old_files(struct files_struct * files)
 {
long j = -1;


Re: [PATCH] [1/4] x86_64: Fail dma_alloc_coherent on dma less devices

2007-08-18 Thread Linus Torvalds

Hmm. I think this is wrong. 

Why? Because the regular 32-bit x86 code does this all completely 
differently, and doesn't use dma_mask at all. Instead, it _only_ uses 
dev->coherent_dma_mask (which, considering the name of the function, 
would seem to make sense).

Considering that the oops comes from this:

/* Kludge to make it bug-to-bug compatible with i386. i386
   uses the normal dma_mask for alloc_coherent. */
dma_mask &= *dev->dma_mask;

and that that code is *old*, and comes from when this file was called 
arch/x86_64/kernel/pci-gart.c, and the comment doesn't seem to even be 
correct any more, I really think the proper fix is likely to just *remove* 
that kludge that causes the oops entirely.

Anyway, I'll apply the patch, because clearly it's not going to make 
things *worse* (it does avoid the oops that we get now), but I don't think 
it's really even "probably still the right thing to do". I really think we 
should just remove the line that causes the oops instead, but that might 
change behaviour for non-oopsing cases, so I'm not ready to do that at 
this point.
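
To make the two options concrete, here is a userspace sketch. It assumes the local mask starts from dev->coherent_dma_mask and falls back to the 32-bit mask when that is zero; struct mock_device, MOCK_DMA_32BIT_MASK and both function names are made up for illustration and are not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_DMA_32BIT_MASK 0xffffffffULL

/* Userspace stand-in for the two struct device fields discussed above. */
struct mock_device {
	uint64_t *dma_mask;		/* NULL when the device cannot do DMA */
	uint64_t coherent_dma_mask;
};

/*
 * Models the patched behaviour: bail out before the "kludge" line
 * dereferences a NULL dma_mask. Returns 0 and stores the mask on
 * success, -1 if the device has no dma_mask.
 */
int coherent_mask_with_kludge(const struct mock_device *dev, uint64_t *mask)
{
	uint64_t m = dev->coherent_dma_mask;

	if (m == 0)
		m = MOCK_DMA_32BIT_MASK;
	if (dev->dma_mask == NULL)
		return -1;		/* device not DMA able */
	m &= *dev->dma_mask;		/* the kludge */
	*mask = m;
	return 0;
}

/*
 * Models the alternative: use only coherent_dma_mask, so dma_mask
 * is never dereferenced and there is nothing to oops on.
 */
uint64_t coherent_mask_only(const struct mock_device *dev)
{
	return dev->coherent_dma_mask ? dev->coherent_dma_mask
				      : MOCK_DMA_32BIT_MASK;
}
```

The second function also shows why dropping the kludge can change behaviour for devices that do not oops: a device whose dma_mask is narrower than its coherent mask would get a wider mask than before.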

Hmm? Who feels in charge of the DMA mapping stuff? Muli? James? Anybody?

Linus

On Wed, 15 Aug 2007, Andi Kleen wrote:
> 
> This should fix an oops with PCMCIA PATA devices
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=8424
> 
> This is not a full fix for the problem, but probably
> still the right thing to do.
> 
> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
> 
> ---
>  arch/x86_64/kernel/pci-dma.c |4 
>  1 file changed, 4 insertions(+)
> 
> Index: linux/arch/x86_64/kernel/pci-dma.c
> ===
> --- linux.orig/arch/x86_64/kernel/pci-dma.c
> +++ linux/arch/x86_64/kernel/pci-dma.c
> @@ -82,6 +82,10 @@ dma_alloc_coherent(struct device *dev, s
>   if (dma_mask == 0)
>   dma_mask = DMA_32BIT_MASK;
>  
> + /* Device not DMA able */
> + if (dev->dma_mask == NULL)
> + return NULL;
> +
>   /* Don't invoke OOM killer */
>   gfp |= __GFP_NORETRY;
>  
> 

Re: [PATCH 4/4] maps: /proc//pmaps interface - memory maps in granularity of pages

2007-08-18 Thread Matt Mackall

On Sat, Aug 18, 2007 at 04:45:31PM +0800, Fengguang Wu wrote:
> Matt,
> 
> On Sat, Aug 18, 2007 at 01:40:42AM -0500, Matt Mackall wrote:
> > > - On memory pressure,
> > >   - as VSZ goes up, RSS will be bounded by physical memory.
> > > So VSZ:RSS ratio actually goes up with memory pressure.
> > 
> > And yes.
> > 
> > But that's not what I'm talking about. You're likely to have more
> > holes in your ranges with memory pressure as things that aren't active
> > get paged or swapped out and back in. And because we're walking the
> > LRU more rapidly, we'll flip over a lot of the active bits more often
> > which will mean more output.
> > 
> > >   - page range is a good unit of locality. They are more likely to be
> > > reclaimed as a whole. So (RSS:page_ranges) wouldn't degrade as much.
> > 
> > There is that. The relative magnitude of the different effects is
> > unclear. But it is clear that the worst case for pmap is much worse
> 
> > than pagemap (two lines per page of RSS?). 
> It's one line per page. No sane app will make vmas proliferate.

Sane apps are few and far between.
 
> So let's talk about the worst case.
> 
> pagemap's data set size is determined by VSZ.
> 4GB VSZ means 1M PFNs, hence 8MB pagemap data.
> 
> pmaps's data set size is bounded by RSS hence physical memory.
> 4GB RSS means up to 1M page ranges, hence ~20M pmaps data.
> Not too bad :)
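
The worst-case figures quoted above can be reproduced with a few lines of arithmetic. This sketch assumes 4 KiB pages and 8 bytes per pagemap entry (which is what turns 1M PFNs into 8MB), and about 20 bytes per pmaps line, a value back-derived from the "~20M" estimate rather than measured.

```c
#include <stdint.h>

/* Assumed page size and pagemap entry size: 4 GiB -> 1M PFNs -> 8 MiB. */
#define PAGE_BYTES          4096ULL
#define PAGEMAP_ENTRY_BYTES 8ULL
/* Assumed average pmaps line length, back-derived from "~20M" per 1M ranges. */
#define PMAPS_LINE_BYTES    20ULL

/* pagemap emits one fixed-size entry per virtual page, so it scales with VSZ. */
uint64_t pagemap_bytes(uint64_t vsz_bytes)
{
	return vsz_bytes / PAGE_BYTES * PAGEMAP_ENTRY_BYTES;
}

/* Worst case for pmaps: every resident page is its own one-page range. */
uint64_t pmaps_worst_case_bytes(uint64_t rss_bytes)
{
	return rss_bytes / PAGE_BYTES * PMAPS_LINE_BYTES;
}
```

With 4 GiB for both inputs, the first function gives 8 MiB and the second 20 MiB, matching the numbers in the quoted message.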

Hmmm, I've been misreading the output.

What does it do with nonlinear VMAs?

-- 
Mathematics is the supreme nostalgia of our time.

Re: [PATCH 8/9] define global BIT macro

2007-08-18 Thread Jiri Slaby

Randy Dunlap napsal(a):
> On Sat, 18 Aug 2007 11:44:12 +0200 (CEST) Jiri Slaby wrote:
> 
>> define global BIT macro
>>
>> move all local BIT defines to the new globally define macro.
>>
>> Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>
>>
>> ---
>>
>>  include/linux/bitops.h  |1 +
>>  include/video/sstfb.h   |1 -
>>  include/video/tdfx.h|2 --
>>  net/mac80211/ieee80211_i.h  |2 --
>>  18 files changed, 1 insertions(+), 37 deletions(-)
>>
>> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
>> index 3255b06..a57b81f 100644
>> --- a/include/linux/bitops.h
>> +++ b/include/linux/bitops.h
>> @@ -3,6 +3,7 @@
>>  #include 
>>  
>>  #ifdef  __KERNEL__
>> +#define BIT(nr) (1UL << (nr))
>>  #define BIT_MASK(nr)(1UL << ((nr) % BITS_PER_LONG))
>>  #define BIT_WORD(nr)((nr) / BITS_PER_LONG)
>>  #define BITS_TO_TYPE(nr, t) (((nr)+(t)-1)/(t))
> 
> 
> So users of the BIT() macro in include/linux/input.h can be
> changed to use the global BIT_MASK() macro...
> and the former can be removed.

I'm afraid I don't understand you. Maybe, you are writing about changes done in
patch no. 7 [1], which didn't go through to the lkml?

[1]
http://www.fi.muni.cz/~xslaby/sklad/07-get-rid-of-input-bit-duplicate-defines.patch

thanks,
-- 
Jiri Slaby ([EMAIL PROTECTED])
Faculty of Informatics, Masaryk University

Re: [PATCH 2/2] MAINTAINERS

2007-08-18 Thread J. Bruce Fields

On Sat, Aug 18, 2007 at 10:00:42AM -0700, Joe Perches wrote:
> On Sat, 2007-08-18 at 11:00 -0400, J. Bruce Fields wrote:
> > Also, some paths that are shared with the client:
> > +F: fs/lockd/
> > +F: fs/nfs_common/
> > +F: net/sunrpc/
> > +F: include/linux/lockd/
> > +F: include/linux/sunrpc/
> 
> This is what I have now:

Looks correct, thanks.--b.

> KERNEL NFSD
> P:J. Bruce Fields
> M:[EMAIL PROTECTED]
> P:Neil Brown
> M:[EMAIL PROTECTED]
> L:[EMAIL PROTECTED]
> W:http://nfs.sourceforge.net/
> S:Supported
> F:fs/nfsd/
> F:include/linux/nfsd/
> F:fs/lockd/
> F:fs/nfs_common/
> F:net/sunrpc/
> F:include/linux/lockd/
> F:include/linux/sunrpc/

Re: [PATCH 2/2] MAINTAINERS

2007-08-18 Thread Joe Perches

On Sat, 2007-08-18 at 10:05 -0700, Randy Dunlap wrote:
> > +   PLEASE include the appropriate maintainers and developers
> > +   that have modified files touched by your patch by using the
> s/that/who/
> > +   automated CC generator (scripts/get_maintainer.pl)

Right.


Re: [PATCH 2/2] MAINTAINERS

2007-08-18 Thread Randy Dunlap

On Sat, 18 Aug 2007 00:08:54 -0700 Joe Perches wrote:

> Added patterns to describe files maintained.
> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
> 
> 
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d3a0684..d7fe1c5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -19,40 +19,44 @@ trivial patch so apply some common sense.
>  3.   Make sure your changes compile correctly in multiple
>   configurations. In particular check that changes work both as a
>   module and built into the kernel.
>  
>  4.   When you are happy with a change make it generally available for
>   testing and await feedback.
>  
>  5.   Make a patch available to the relevant maintainer in the list. Use
>   'diff -u' to make the patch easy to merge. Be prepared to get your
>   changes sent back with seemingly silly requests about formatting
>   and variable names.  These aren't as silly as they seem. One
>   job the maintainers (and especially Linus) do is to keep things
>   looking the same. Sometimes this means that the clever hack in
>   your driver to get around a problem actually needs to become a
>   generalized kernel feature ready for next time.
>  
>   PLEASE check your patch with the automated style checker
>   (scripts/checkpatch.pl) to catch trival style violations.
>   See Documentation/CodingStyle for guidance here.
>  
> + PLEASE include the appropriate maintainers and developers
> + that have modified files touched by your patch by using the

s/that/who/

> + automated CC generator (scripts/get_maintainer.pl)


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

Re: [PATCH 2/2] MAINTAINERS

2007-08-18 Thread Joe Perches

On Sat, 2007-08-18 at 11:00 -0400, J. Bruce Fields wrote:
> Also, some paths that are shared with the client:
> +F:   fs/lockd/
> +F:   fs/nfs_common/
> +F:   net/sunrpc/
> +F:   include/linux/lockd/
> +F:   include/linux/sunrpc/

This is what I have now:

KERNEL NFSD
P:  J. Bruce Fields
M:  [EMAIL PROTECTED]
P:  Neil Brown
M:  [EMAIL PROTECTED]
L:  [EMAIL PROTECTED]
W:  http://nfs.sourceforge.net/
S:  Supported
F:  fs/nfsd/
F:  include/linux/nfsd/
F:  fs/lockd/
F:  fs/nfs_common/
F:  net/sunrpc/
F:  include/linux/lockd/
F:  include/linux/sunrpc/



Re: Thinking outside the box on file systems

2007-08-18 Thread Marc Perkel

--- Kyle Moffett <[EMAIL PROTECTED]> wrote:

> On Aug 17, 2007, at 15:01:48, Phillip Susi wrote:
> > [EMAIL PROTECTED] wrote:
> >> It will become even *more* of a "not that common"
> if the lock will  
> >> block moves and ACL changes *across the
> filesystem* for  
> >> potentially *minutes* at a time.
> >
> > It will not take anywhere NEAR minutes at a time
> to update the in  
> > memory dentries, more like 50ms.
> 
> One last comment:
> 
> 50ms to update in-memory dentries would be FRIGGING
> TERRIBLE!!!   
> Using Perl, an interpreted language, the following
> script takes 3.39s  
> to run on one of my lower-end systems:
> 
> for (0 .. 1) {
>   mkdir "a-$_";
>   mkdir "b-$_";
>   rename "a-$_", "b-$_";
> }
> 
> It's not even deleting things afterwards so it's
> populating a  
> directory with ten thousand entries.  We can easily
> calculate  
> 10,000/3.39 = 2,949 entries per second, or 0.339
> milliseconds per entry.
> 
> When I change it to rmdir things instead, the
> runtime goes down to  
> 2.89s == 3460 entries/sec == 0.289 milliseconds per
> entry.
> 
> If such a scheme even increases the overhead of a
> directory rename by  
> a hundredth of a millisecond on that box it would
> easily be a 2-3%  
> performance hit.  Given that people tend to kill for
> 1% performance  
> boosts, that's not likely to be a good idea.
> 
> Cheers,
> Kyle Moffett
> 
> 
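
The arithmetic in the quoted message can be checked with a short sketch. The function names are made up for illustration; the inputs are the figures quoted above (3.39s for ten thousand entries, a hypothetical 0.01ms of added overhead per rename).

```c
/* Average cost of one entry in milliseconds, given total seconds and entry count. */
double ms_per_entry(double seconds, double entries)
{
	return seconds * 1000.0 / entries;
}

/* Relative cost of `extra_ms` of added overhead on top of `base_ms`, in percent. */
double overhead_percent(double extra_ms, double base_ms)
{
	return extra_ms / base_ms * 100.0;
}
```

ms_per_entry(3.39, 10000) comes out at about 0.339ms, and overhead_percent(0.01, 0.339) at just under 3%, which is where the "2-3% performance hit" estimate comes from.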

What I suggested was a concept of a new way to look at
a file system. What you are arguing here is why it
wouldn't work based on your theories as to how such a
file system would be implemented. In attacking how
slow you think it might be you are making assumptions
that wouldn't apply to how this would be implemented.
You are assuming that it would be implemented in ways
that you are familiar with. That is a wrong
assumption.

Linux isn't going to make progress when people try to
figure out how to make something NOT work rather than
to make something work. So if you are going to put
effort into this then why not try to figure out how to
get around the issues you are raising rather than to
attack the idea as unsolvable.

When I originally suggested that the names would be a
"hash" I didn't mean that it is going to be only a
hash. You have successfully argued that just a hash
would have problems. Which means that a real solution
is going to be more complex.

I suggest that it would be easier to figure out how to
make moves of large directory structure fast and
efficient with automatic inheritance of rights.

I know it can be done because Microsoft is doing it
and Novell Netware was doing it 20 years ago. So the
fact that it is done by others disproves your
arguments that it can't be done.

Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


Re: [PATCH 8/9] define global BIT macro

2007-08-18 Thread Randy Dunlap

On Sat, 18 Aug 2007 11:44:12 +0200 (CEST) Jiri Slaby wrote:

> define global BIT macro
> 
> move all local BIT defines to the new globally define macro.
> 
> Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>
> 
> ---
> 
>  include/linux/bitops.h  |1 +
>  include/video/sstfb.h   |1 -
>  include/video/tdfx.h|2 --
>  net/mac80211/ieee80211_i.h  |2 --
>  18 files changed, 1 insertions(+), 37 deletions(-)
> 
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index 3255b06..a57b81f 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -3,6 +3,7 @@
>  #include 
>  
>  #ifdef   __KERNEL__
> +#define BIT(nr)  (1UL << (nr))
>  #define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
>  #define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
>  #define BITS_TO_TYPE(nr, t)  (((nr)+(t)-1)/(t))


So users of the BIT() macro in include/linux/input.h can be
changed to use the global BIT_MASK() macro...
and the former can be removed.
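
A userspace sketch of why the distinction matters, assuming (as the comment above implies) that the local BIT() in input.h wraps the bit number the way BIT_MASK() does. The three macros mirror the quoted patch; BITS_PER_LONG is derived locally here and the two helper functions are hypothetical names, not kernel API.

```c
#include <limits.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BIT(nr)		(1UL << (nr))
#define BIT_MASK(nr)	(1UL << ((nr) % BITS_PER_LONG))
#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)

/*
 * Set bit `nr` in a bitmap that spans several longs.
 * BIT_WORD() picks the array element, BIT_MASK() the bit inside it.
 * Plain BIT(nr) would be undefined here once nr >= BITS_PER_LONG.
 */
void bitmap_set_bit(unsigned long *map, unsigned int nr)
{
	map[BIT_WORD(nr)] |= BIT_MASK(nr);
}

/* Test bit `nr` in the same multi-word layout. */
int bitmap_test_bit(const unsigned long *map, unsigned int nr)
{
	return (map[BIT_WORD(nr)] & BIT_MASK(nr)) != 0;
}
```

For bit numbers below BITS_PER_LONG the two macros give the same value, so only callers that index multi-word bitmaps need the BIT_MASK() form.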

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

Re: [PATCH] [540/2many] MAINTAINERS - WATCHDOG DEVICE DRIVERS

2007-08-18 Thread Wim Van Sebroeck

Hi Joe,

> > Patch removed until there is a consensus on how to proceed with your 
> > proposal.
> 
> Hi Wim.
> 
> I think that's wise.
> 
> I've got all the changes that people have CC'd me.
> I expect it'll be an all or nothing sort of thing.

My opinion: the patch you sent me just added the different watchdog
related files/directories to my maintainers entry. Since someone (in the past)
did not send a patch directly to me because he had searched only for WDT and
not watchdog, I thought: why not, it can only improve the search for
a maintainer.

But what Linus said is indeed true: managing the MAINTAINERS file is difficult.
How many maintainer entries do we have where we don't know that the person
stopped developing (and thus maintaining), or that the person died, or that he
is unreachable for any outside communication (because he is in prison or because
he chose to retire on a beautiful island with only palm trees and quietness),
...
Example: We recently removed the maintainer entry for Ken Hollis (who definitely
was involved in the first watchdog device drivers) because he was not "reachable"
anymore via e-mail. Who knows what happened...

So the maintenance of the "maintainers" will always be difficult. Adding a new
entry is part of the development process, but removing one is not. And this is
regardless of whatever "system" you use to store the maintainers.

So even if your patches are not accepted in their current form, I think it
is/was good that you had a look at the whole MAINTAINERS file and tried to find
out how accurate the info was.

Greetings,
Wim.

Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Jan Engelhardt


On Aug 18 2007 17:28, Chris Boot wrote:
>
> I will. This will probably be on Monday now, since the machine isn't
> accepting SysRq requests over the serial console. :-(

Ah yeah, stupid null-modem cables!
You can also trigger sysrq from /proc/sysrq-trigger (well, as long
as the system lives)

Jan
-- 

Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot


Måns Rullgård wrote:
> Chris Boot <[EMAIL PROTECTED]> writes:
>> Måns Rullgård wrote:
>>> Chris Boot <[EMAIL PROTECTED]> writes:
>>>> All,
>>>>
>>>> I've got a box running RHEL5 and haven't been impressed by ext3
>>>> performance on it (running off a 1.5TB HP MSA20 using the cciss
>>>> driver). I compiled XFS as a module and tried it out since I'm used to
>>>> using it on Debian, which runs much more efficiently. However, every
>>>> so often the kernel panics as below. Apologies for the tainted kernel,
>>>> but we run VMware Server on the box as well.
>>>>
>>>> Does anyone have any hints/tips for using XFS on Red Hat? What's
>>>> causing the panic below, and is there a way around this?
>>>>
>>>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>>>> printing eip:
>>>> c0415974
>>>> *pde =
>>>> Oops:  [#1]
>>>> SMP last sysfs file: /block/loop7/dev
>>> [...]
>>>> [] xfsbufd_wakeup+0x28/0x49 [xfs]
>>>> [] shrink_slab+0x56/0x13c
>>>> [] try_to_free_pages+0x162/0x23e
>>>> [] __alloc_pages+0x18d/0x27e
>>>> [] find_or_create_page+0x53/0x8c
>>>> [] __getblk+0x162/0x270
>>>> [] do_lookup+0x53/0x157
>>>> [] ext3_getblk+0x7c/0x233 [ext3]
>>>> [] ext3_getblk+0xeb/0x233 [ext3]
>>>> [] mntput_no_expire+0x11/0x6a
>>>> [] ext3_bread+0x13/0x69 [ext3]
>>>> [] htree_dirblock_to_tree+0x22/0x113 [ext3]
>>>> [] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
>>>> [] do_path_lookup+0x20e/0x25f
>>>> [] get_empty_filp+0x99/0x15e
>>>> [] ext3_permission+0x0/0xa [ext3]
>>>> [] ext3_readdir+0x1ce/0x59b [ext3]
>>>> [] filldir+0x0/0xb9
>>>> [] sys_fstat64+0x1e/0x23
>>>> [] vfs_readdir+0x63/0x8d
>>>> [] filldir+0x0/0xb9
>>>> [] sys_getdents+0x5f/0x9c
>>>> [] syscall_call+0x7/0xb
>>>> ===
>>>
>>> Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
>>> seems to be enough to overflow it.
>>
>> Thanks, that explains a lot. However, I don't have any XFS filesystems
>> mounted over loop devices on ext3. Earlier in the day I had iso9660 on
>> loop on xfs, could that have caused the issue? It was unmounted and
>> deleted when this panic occurred.
>
> The mention of /block/loop7/dev and the presence of both XFS and ext3
> functions in the call stack suggested to me that you might have an ext3
> filesystem in a loop device on XFS.  I see no other explanation for
> that call stack other than a stack overflow, but then we're still back
> at the same root cause.
>
> Are you using device-mapper and/or md?  They too are known to blow 4k
> stacks when used with XFS.

I am. The situation earlier on was iso9660 on loop on xfs on lvm on
cciss. I guess that might have smashed the stack undetectably and
induced corruption encountered later on? When I experienced this panic
the machine would have probably been performing a backup, which was
simply a load of ext3/xfs filesystems on lvm on the HP cciss controller.
None of the loop devices would have been mounted.

I have a few machines now with 4k stacks and using lvm + md + xfs and
have no trouble at all, but none are Red Hat (all Debian) and none use
cciss either. Maybe it's a deadly combination.

>> I'll probably just try and recompile the kernel with 8k stacks and see
>> how it goes. Screw the support, we're unlikely to get it anyway. :-P
>
> Please report how this works out.

I will. This will probably be on Monday now, since the machine isn't
accepting SysRq requests over the serial console. :-(

Many thanks,
Chris


Re: power off disk drives while running

2007-08-18 Thread Jan Engelhardt

On Aug 18 2007 12:08, Marty Leisner wrote:
>
>In embedded system design, it may be useful to poweroff the disks (as opposed
>to merely spinning them down).  We want to leave the system running while
>the disk is powered down, and let the disk powerup when it needs to be 
>spun up.

That means you also have to power it on...

>While the "power off mechanism" would be platform dependent, is there a
>generic path to announce "prepare for power going away"?

I do not see why that would be needed from a software point of view. Just make
sure that the disk does not needlessly emergency-park when pulling power. When
someone wants to write to disk, the request goes to the device driver, which
hands it to the controller, which hands it to the disk. And your controller
should be able to handle it (e.g. wait until reconnect) when there is a request
for a disk that is powered off.

Jan

[PATCH] Fix f_version type: should be u64 instead of unsigned long

2007-08-18 Thread Mathieu Desnoyers

Fix f_version type: should be u64 instead of unsigned long

There is a type inconsistency between struct inode i_version and struct file
f_version.

fs.h:

struct inode
  u64 i_version;

and

struct file
  unsigned long   f_version;

Users do:

fs/ext3/dir.c:

if (filp->f_version != inode->i_version) {

So why isn't f_version a u64? It becomes a problem if versions get
higher than 2^32 and we are on an architecture where longs are 32 bits.

This patch changes the f_version type to u64, and updates the users accordingly.

It applies to 2.6.23-rc2-mm2.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Martin Bligh <[EMAIL PROTECTED]>
CC: Randy Dunlap <[EMAIL PROTECTED]>
CC: Al Viro <[EMAIL PROTECTED]>
---
 fs/ext3/dir.c            |    2 +-
 fs/ext4/dir.c            |    2 +-
 fs/ocfs2/dir.c           |    2 +-
 fs/proc/base.c           |    4 ++--
 include/linux/fs.h       |    2 +-
 include/linux/seq_file.h |    2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

Index: linux-2.6-lttng/include/linux/fs.h
===
--- linux-2.6-lttng.orig/include/linux/fs.h 2007-08-18 11:05:10.0 -0400
+++ linux-2.6-lttng/include/linux/fs.h  2007-08-18 11:05:56.0 -0400
@@ -799,7 +799,7 @@ struct file {
    unsigned int            f_uid, f_gid;
    struct file_ra_state    f_ra;
 
-   unsigned long           f_version;
+   u64                     f_version;
 #ifdef CONFIG_SECURITY
    void                    *f_security;
 #endif
Index: linux-2.6-lttng/fs/ext3/dir.c
===
--- linux-2.6-lttng.orig/fs/ext3/dir.c  2007-08-18 11:08:25.0 -0400
+++ linux-2.6-lttng/fs/ext3/dir.c   2007-08-18 11:08:32.0 -0400
@@ -210,7 +210,7 @@ revalidate:
 * not the directory has been modified
 * during the copy operation.
 */
-   unsigned long version = filp->f_version;
+   u64 version = filp->f_version;
 
error = filldir(dirent, de->name,
de->name_len,
Index: linux-2.6-lttng/fs/ext4/dir.c
===
--- linux-2.6-lttng.orig/fs/ext4/dir.c  2007-08-18 11:08:47.0 -0400
+++ linux-2.6-lttng/fs/ext4/dir.c   2007-08-18 11:08:53.0 -0400
@@ -210,7 +210,7 @@ revalidate:
 * not the directory has been modified
 * during the copy operation.
 */
-   unsigned long version = filp->f_version;
+   u64 version = filp->f_version;
 
error = filldir(dirent, de->name,
de->name_len,
Index: linux-2.6-lttng/fs/ocfs2/dir.c
===
--- linux-2.6-lttng.orig/fs/ocfs2/dir.c 2007-08-18 11:09:23.0 -0400
+++ linux-2.6-lttng/fs/ocfs2/dir.c  2007-08-18 11:09:30.0 -0400
@@ -183,7 +183,7 @@ revalidate:
 * not the directory has been modified
 * during the copy operation.
 */
-   unsigned long version = filp->f_version;
+   u64 version = filp->f_version;
unsigned char d_type = DT_UNKNOWN;
 
if (de->file_type < OCFS2_FT_MAX)
Index: linux-2.6-lttng/fs/proc/base.c
===
--- linux-2.6-lttng.orig/fs/proc/base.c 2007-08-18 11:11:21.0 -0400
+++ linux-2.6-lttng/fs/proc/base.c  2007-08-18 11:11:39.0 -0400
@@ -2570,7 +2570,7 @@ static int proc_task_readdir(struct file
/* f_version caches the tgid value that the last readdir call couldn't
 * return. lseek aka telldir automagically resets f_version to 0.
 */
-   tid = filp->f_version;
+   tid = (int)filp->f_version;
filp->f_version = 0;
for (task = first_tid(leader, tid, pos - 2);
 task;
@@ -2579,7 +2579,7 @@ static int proc_task_readdir(struct file
if (proc_task_fill_cache(filp, dirent, filldir, task, tid) < 0) {
/* returning this tgid failed, save it as the first
 * pid for the next readir call */
-   filp->f_version = tid;
+   filp->f_version = (u64)tid;
put_task_struct(task);
break;
}
Index: linux-2.6-lttng/include/linux/seq_file.h
===
---

power off disk drives while running

2007-08-18 Thread Marty Leisner


In embedded system design, it may be useful to poweroff the disks (as opposed
to merely spinning them down).  We want to leave the system running while
the disk is powered down, and let the disk powerup when it needs to be 
spun up.

While the "power off mechanism" would be platform dependent, is there a
generic path to announce "prepare for power going away"?


marty


Re: [patch 2/2] Sort module list by pointer address to get coherent sleepable seq_file iterators

2007-08-18 Thread Mathieu Desnoyers

* Fengguang Wu ([EMAIL PROTECTED]) wrote:
> Al Viro,
> 
> Does this sounds like a good fix?
> ===
> 
> seq_file version fixes
> 
> - f_version is 'unsigned long', it's pointless to do more than that.

Hrm, this is weird...

fs.h:

struct inode
  u64 i_version;

and

struct file
  unsigned long   f_version;

Users do:

fs/ext3/dir.c:

if (filp->f_version != inode->i_version) {

So why isn't f_version a u64? It becomes a problem if versions get
higher than 2^32 and we are on an architecture where longs are 32 bits.
I think the problem is the f_version field type, not seq_file at all.
I'll prepare a patch for this.
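
To make the truncation concrete, here is a minimal user-space sketch (not kernel code). It assumes a 32-bit unsigned long, modelled here with uint32_t, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models an f_version stored in a 32-bit 'unsigned long' and compared
 * against a 64-bit i_version, as in the ext3 readdir check.
 */
static bool version_matches_32(uint64_t i_version)
{
    uint32_t f_version = (uint32_t)i_version; /* truncating store */

    /* f_version is zero-extended to 64 bits for the comparison */
    return f_version == i_version;
}

/* Same check with f_version widened to 64 bits, as the patch proposes. */
static bool version_matches_64(uint64_t i_version)
{
    uint64_t f_version = i_version;

    return f_version == i_version;
}
```

With the 32-bit field the comparison starts failing as soon as the inode version no longer fits in 32 bits, even though nothing changed between the store and the compare.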

> - m->version should not be reset when we are bumping up the buf size.
> 

Hr, what is this twisted use of versions anyway ?!?

If I look at other version users elsewhere in the kernel, they mostly
do:

repeat:
f_version = i_version
do something
if (f_version != i_version)
   repeat;

So they can see if the underlying inode has changed during the
operation. seq_file does it completely the other way around:

m->version = f_version;
do something

and, well, versions are never really used at all.

If we want to use versioning there, we should keep a version counter
associated with the resource used by seq_files, incremented each time
the data structures are modified.

Then, in the read side, we could sanely do:

seq open():
f_version = current version

seq read():
repeat:
m->version = f_version;
do something
if (m->version != current version)
  repeat;

This would only make sure that the given read operation has consistent
data. It would not certify data consistency across reads.
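
The retry scheme above can be sketched in plain C as follows. This is a single-threaded illustration only: struct resource, the hook and the function names are hypothetical, and a real implementation would need proper locking or memory barriers around the version reads.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical resource with a version counter bumped on every change. */
struct resource {
    uint64_t version;
    int data;
};

/* Test hook called in the middle of a read to simulate a writer. */
typedef void (*mid_read_hook)(struct resource *r, int attempt);

/* Snapshot the version, read, and repeat if the version moved meanwhile. */
static int read_consistent(struct resource *r, mid_read_hook hook,
                           int *attempts)
{
    uint64_t snap;
    int value;
    int n = 0;

    do {
        snap = r->version;      /* m->version = current version */
        value = r->data;        /* do something */
        if (hook)
            hook(r, n);
        n++;
    } while (snap != r->version);

    if (attempts)
        *attempts = n;
    return value;
}

/* Simulated writer: modifies the resource during the first attempt only. */
static void write_once_hook(struct resource *r, int attempt)
{
    if (attempt == 0) {
        r->data = 2;
        r->version++;
    }
}
```

As noted above, this only guarantees that a single read saw consistent data; it says nothing about consistency across separate reads.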

I have looked at the use of m->version in fs/proc/task_mmu.c, and I think it
is just really weird. I think the proper way to do it would be to put
last_addr in a field of a structure that m->private points to.
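
A rough sketch of that idea, using a stand-in for struct seq_file so that it builds in user space. The state structure and helper names are hypothetical and are not taken from task_mmu.c.

```c
/* Stand-in for the kernel's struct seq_file; only the field used here. */
struct seq_file_mock {
    void *private;
};

/* Hypothetical per-open iterator state kept behind m->private. */
struct maps_iter_state {
    unsigned long last_addr;
};

/* Record where the previous read stopped. */
static void maps_remember(struct seq_file_mock *m, unsigned long addr)
{
    struct maps_iter_state *st = m->private;

    st->last_addr = addr;
}

/* Fetch the address the next read should resume from. */
static unsigned long maps_resume_from(const struct seq_file_mock *m)
{
    const struct maps_iter_state *st = m->private;

    return st->last_addr;
}
```

This keeps the iterator position in a field with an obvious name and type instead of overloading a version counter.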

Mathieu

> Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
> ---
>  fs/seq_file.c|1 -
>  include/linux/seq_file.h |2 +-
>  2 files changed, 1 insertion(+), 2 deletions(-)
> 
> --- linux-2.6.23-rc3.orig/include/linux/seq_file.h
> +++ linux-2.6.23-rc3/include/linux/seq_file.h
> @@ -18,7 +18,7 @@ struct seq_file {
>   size_t from;
>   size_t count;
>   loff_t index;
> - loff_t version;
> + unsigned long version;
>   struct mutex lock;
>   const struct seq_operations *op;
>   void *private;
> --- linux-2.6.23-rc3.orig/fs/seq_file.c
> +++ linux-2.6.23-rc3/fs/seq_file.c
> @@ -134,7 +134,6 @@ ssize_t seq_read(struct file *file, char
>   if (!m->buf)
>   goto Enomem;
>   m->count = 0;
> - m->version = 0;
>   }
>   m->op->stop(m, p);
>   m->count = 0;
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: [PATCH trivial] include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros

2007-08-18 Thread Paul Mundt

On Sat, Aug 18, 2007 at 05:32:05PM +0200, Stefan Richter wrote:
> These don't appear anywhere else in the kernel anymore.
> 
> Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
> Cc: Bryan Wu <[EMAIL PROTECTED]>
> Cc: Yoshinori Sato <[EMAIL PROTECTED]>
> Cc: Greg Ungerer <[EMAIL PROTECTED]>
> Cc: Paul Mundt <[EMAIL PROTECTED]>
> Cc: Miles Bader <[EMAIL PROTECTED]>
> ---
>  include/asm-sh64/system.h  |3 +--
> 

Acked-by: Paul Mundt <[EMAIL PROTECTED]>

Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Måns Rullgård

Chris Boot <[EMAIL PROTECTED]> writes:

> Måns Rullgård wrote:
>> Chris Boot <[EMAIL PROTECTED]> writes:
>>
>>
>>> All,
>>>
>>> I've got a box running RHEL5 and haven't been impressed by ext3
>>> performance on it (running off a 1.5TB HP MSA20 using the cciss
>>> driver). I compiled XFS as a module and tried it out since I'm used to
>>> using it on Debian, which runs much more efficiently. However, every
>>> so often the kernel panics as below. Apologies for the tainted kernel,
>>> but we run VMware Server on the box as well.
>>>
>>> Does anyone have any hints/tips for using XFS on Red Hat? What's
>>> causing the panic below, and is there a way around this?
>>>
>>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>>> printing eip:
>>> c0415974
>>> *pde = 
>>> Oops:  [#1]
>>> SMP last sysfs file: /block/loop7/dev
[...]
>>> [] xfsbufd_wakeup+0x28/0x49 [xfs]
>>> [] shrink_slab+0x56/0x13c
>>> [] try_to_free_pages+0x162/0x23e
>>> [] __alloc_pages+0x18d/0x27e
>>> [] find_or_create_page+0x53/0x8c
>>> [] __getblk+0x162/0x270
>>> [] do_lookup+0x53/0x157
>>> [] ext3_getblk+0x7c/0x233 [ext3]
>>> [] ext3_getblk+0xeb/0x233 [ext3]
>>> [] mntput_no_expire+0x11/0x6a
>>> [] ext3_bread+0x13/0x69 [ext3]
>>> [] htree_dirblock_to_tree+0x22/0x113 [ext3]
>>> [] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
>>> [] do_path_lookup+0x20e/0x25f
>>> [] get_empty_filp+0x99/0x15e
>>> [] ext3_permission+0x0/0xa [ext3]
>>> [] ext3_readdir+0x1ce/0x59b [ext3]
>>> [] filldir+0x0/0xb9
>>> [] sys_fstat64+0x1e/0x23
>>> [] vfs_readdir+0x63/0x8d
>>> [] filldir+0x0/0xb9
>>> [] sys_getdents+0x5f/0x9c
>>> [] syscall_call+0x7/0xb
>>> ===
>>>
>>
>> Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
>> seems to be enough to overflow it.
>>
> Thanks, that explains a lot. However, I don't have any XFS filesystems
> mounted over loop devices on ext3. Earlier in the day I had iso9660 on
> loop on xfs, could that have caused the issue? It was unmounted and
> deleted when this panic occurred.

The mention of /block/loop7/dev and the presence of both XFS and ext3
functions in the call stack suggested to me that you might have an ext3
filesystem in a loop device on XFS.  I see no other explanation for
that call stack other than a stack overflow, but then we're still back
at the same root cause.

Are you using device-mapper and/or md?  They too are known to blow 4k
stacks when used with XFS.

> I'll probably just try and recompile the kernel with 8k stacks and see
> how it goes. Screw the support, we're unlikely to get it anyway. :-P

Please report how this works out.

-- 
Måns Rullgård
[EMAIL PROTECTED]


Re: [draft] Blackfin Early Printk implmentation

2007-08-18 Thread Robin Getz

On Sat 18 Aug 2007 02:23, Sam Ravnborg pondered:
> > > What was preventing you from just using the x86_64 code here?
> > 
> > Some was borrowed - but not much. since we don't support vga, or 
> > 16550 UARTs (Blackfin has it's own on-chip UART), I don't think 
> > this would work.  Everyone implements implements direct IO to 
> > the hardware (except me, since I  put it into the driver file,
> > and force Sonic - the serial driver developer - to maintain it
> > forever). 
> > 
> > Most of the other early printks talks directly to the hardware.
> I only looked at your version and it looked general thats why I brought
> up the code sharing idea - which I agree is not possible.

Believe me - I would actually like this more - put the I/O parts into the 
serial driver or vga driver or xxx driver - and early printk becomes a 
generic function that is supported on every platform, with a CON_BOOT 
defined.

But, I didn't want (or have the time) to go mucking in everyone else's 
arch/drivers to move things around - but the more I think about it - the 
better it would be. Maybe on my next long plane trip I will look at it.

> > > Thinking that all should do the same so maybe alpha ought to
> > > change...
> > 
> > When I looked at all the printk implementations, I thought they were
> > all kind of hokey, and not very common - but what do you want for a
> > debug interface that lasts less than 5 seconds?
> > 
> > ./arch/x86_64/kernel/early_printk.c
> > ./arch/blackfin/kernel/early_printk.c
> > ./arch/sh64/kernel/early_printk.c
> > ./arch/sh/kernel/early_printk.c
> > ./arch/i386/kernel/early_printk.c
> > ./arch/mips/kernel/early_printk.c
> > 
> > I didn't see an alpha implementation - where is it done?
> Alpha uses the imlementation in lib/*print.c somehow.
> And I think the right choice would be to implement
> a private version of early_printk for alpha like the
> other architectures do.

When I looked for EARLY_PRINTK in ./arch/alpha and include/asm-alpha, it only
shows up in the Kconfig files - nothing seems to use it...

> Thanks for the split-up. I could follow the changes now.

Any issues/comments? Or do things look OK?

-Robin

[PATCH trivial] include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros

2007-08-18 Thread Stefan Richter

These don't appear anywhere else in the kernel anymore.

Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
Cc: Bryan Wu <[EMAIL PROTECTED]>
Cc: Yoshinori Sato <[EMAIL PROTECTED]>
Cc: Greg Ungerer <[EMAIL PROTECTED]>
Cc: Paul Mundt <[EMAIL PROTECTED]>
Cc: Miles Bader <[EMAIL PROTECTED]>
---
 include/asm-blackfin/system.h  |4 +---
 include/asm-h8300/system.h |3 +--
 include/asm-m68knommu/system.h |3 +--
 include/asm-sh64/system.h  |3 +--
 include/asm-v850/system.h  |3 +--
 5 files changed, 5 insertions(+), 11 deletions(-)

Index: linux/include/asm-blackfin/system.h
===
--- linux.orig/include/asm-blackfin/system.h
+++ linux/include/asm-blackfin/system.h
@@ -119,9 +119,7 @@ extern unsigned long irq_flags;
 #define mb()   asm volatile (""   : : :"memory")
 #define rmb()  asm volatile (""   : : :"memory")
 #define wmb()  asm volatile (""   : : :"memory")
-#define set_rmb(var, value) do { (void) xchg(&var, value); } while (0)
-#define set_mb(var, value) set_rmb(var, value)
-#define set_wmb(var, value) do { var = value; wmb(); } while (0)
+#define set_mb(var, value) do { (void) xchg(&var, value); } while (0)
 
 #define read_barrier_depends() do { } while(0)
 
Index: linux/include/asm-h8300/system.h
===
--- linux.orig/include/asm-h8300/system.h
+++ linux/include/asm-h8300/system.h
@@ -82,8 +82,7 @@ asmlinkage void resume(void);
 #define mb()   asm volatile (""   : : :"memory")
 #define rmb()  asm volatile (""   : : :"memory")
 #define wmb()  asm volatile (""   : : :"memory")
-#define set_rmb(var, value) do { xchg(&var, value); } while (0)
-#define set_mb(var, value) set_rmb(var, value)
+#define set_mb(var, value) do { xchg(&var, value); } while (0)
 
 #ifdef CONFIG_SMP
 #define smp_mb()   mb()
Index: linux/include/asm-m68knommu/system.h
===
--- linux.orig/include/asm-m68knommu/system.h
+++ linux/include/asm-m68knommu/system.h
@@ -104,8 +104,7 @@ asmlinkage void resume(void);
 #define mb()   asm volatile (""   : : :"memory")
 #define rmb()  asm volatile (""   : : :"memory")
 #define wmb()  asm volatile (""   : : :"memory")
-#define set_rmb(var, value) do { xchg(&var, value); } while (0)
-#define set_mb(var, value) set_rmb(var, value)
+#define set_mb(var, value) do { xchg(&var, value); } while (0)
 
 #ifdef CONFIG_SMP
 #define smp_mb()   mb()
Index: linux/include/asm-sh64/system.h
===
--- linux.orig/include/asm-sh64/system.h
+++ linux/include/asm-sh64/system.h
@@ -62,8 +62,7 @@ extern void __xchg_called_with_bad_point
 #define smp_read_barrier_depends() do { } while (0)
 #endif /* CONFIG_SMP */
 
-#define set_rmb(var, value) do { (void)xchg(&var, value); } while (0)
-#define set_mb(var, value) set_rmb(var, value)
+#define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
 
 /* Interrupt Control */
 #ifndef HARD_CLI
Index: linux/include/asm-v850/system.h
===
--- linux.orig/include/asm-v850/system.h
+++ linux/include/asm-v850/system.h
@@ -66,8 +66,7 @@ static inline int irqs_disabled (void)
 #define rmb()  mb ()
 #define wmb()  mb ()
 #define read_barrier_depends() ((void)0)
-#define set_rmb(var, value) do { xchg (&var, value); } while (0)
-#define set_mb(var, value) set_rmb (var, value)
+#define set_mb(var, value) do { xchg (&var, value); } while (0)
 
 #define smp_mb()   mb ()
 #define smp_rmb()  rmb ()

-- 
Stefan Richter
-=-=-=== =--- =--=-
http://arcgraph.de/sr/


Re: drivers/infiniband/mlx/mad.c misplaced ;

2007-08-18 Thread Daniel Schaffrath



On 2007/08/16  , at 13:01, Karsten Keil wrote:


On Thu, Aug 16, 2007 at 01:22:04PM +0300, Ilpo Järvinen wrote:


...I guess those guys hunting for broken busyloops in the other thread
could also benefit from similar searching commands introduced in this
thread... ...Ccing Satyam to catch their attention too.


./drivers/isdn/hisax/hfc_pci.c
125:if (Read_hfc(cs, HFCPCI_INT_S1)) ;
155:if (Read_hfc(cs, HFCPCI_INT_S1)) ;
1483:   if (Read_hfc(cs, HFCPCI_INT_S1)) ;
--
./drivers/isdn/hisax/hfc_sx.c
377:if (Read_hfc(cs, HFCSX_INT_S1)) ;
407:if (Read_hfc(cs, HFCSX_INT_S2)) ;
1246:   if (Read_hfc(cs, HFCSX_INT_S1)) ;
--


These are workarounds to avoid compiler warnings about ignored return
values that I got some time ago on some architecture.
Maybe '(void) Read_hfc(cs, HFCSX_INT_S1)' is a better option to get
rid of the warnings.
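
For comparison, a small sketch of the two idioms side by side. read_reg() is a hypothetical stand-in for Read_hfc(), where the read itself has a side effect such as acknowledging an interrupt. Whether either form silences a given warning depends on the compiler and on how the function is declared.

```c
static int reads; /* counts simulated register reads */

/* Hypothetical stand-in for Read_hfc(): the read itself has a side effect. */
static int read_reg(int reg)
{
    reads++;
    return reg & 0xff;
}

/* Idiom found by the search: test the value, then do nothing with it. */
static void ack_with_empty_if(void)
{
    if (read_reg(0x1e))
        ;
}

/* Suggested alternative: state explicitly that the value is discarded. */
static void ack_with_void_cast(void)
{
    (void) read_reg(0x1e);
}
```

Both functions perform exactly one read. The cast makes the intent visible and cannot be mistaken for a misplaced semicolon.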



Re: [PATCH 2/2] MAINTAINERS

2007-08-18 Thread J. Bruce Fields

On Sat, Aug 18, 2007 at 12:08:54AM -0700, Joe Perches wrote:
>  KERNEL NFSD
>  P:   J. Bruce Fields
>  M:   [EMAIL PROTECTED]
>  P:   Neil Brown
>  M:   [EMAIL PROTECTED]
>  L:   [EMAIL PROTECTED]
>  W:   http://nfs.sourceforge.net/
>  S:   Supported
> +F:   fs/nfsd/
> +F:   include/linux/nfsd/

Also, some paths that are shared with the client:

+F: fs/lockd/
+F: fs/nfs_common/
+F: net/sunrpc/
+F: include/linux/lockd/
+F: include/linux/sunrpc/

--b.

Re: [PATCH] x86-64: memset optimization

2007-08-18 Thread Stephen Hemminger

On Sat, 18 Aug 2007 11:46:24 +0200
Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Saturday 18 August 2007 01:34:46 Stephen Hemminger wrote:
> > Optimize uses of memset with small constant offsets.
> > This will generate smaller code, and avoid the slow rep/string instructions.
> > Code copied from i386 with a little cleanup.
> 
> 
> Newer gcc should do all this on its own.  That is why I intentionally
> didn't implement it on 64bit.
> 
> On what compiler version did you see smaller code?
> 
> -Andi
> 

The problem is that on x86-64 you are overriding memset() so the builtin
version doesn't kick in.  You allow gcc to inline memcpy but not memset.

What about adding code similar to the memcpy() stuff?

--- a/include/asm-x86_64/string.h   2007-08-18 07:37:58.0 -0700
+++ b/include/asm-x86_64/string.h   2007-08-18 07:44:31.0 -0700
@@ -43,8 +43,13 @@ extern void *__memcpy(void *to, const vo
   __ret; }) 
 #endif
 
-#define __HAVE_ARCH_MEMSET
-void *memset(void *s, int c, size_t n);
+#define __HAVE_ARCH_MEMSET 1
+#if (__GNUC__ == 4 && __GNUC_MINOR__ >= 3) || __GNUC__ > 4
+extern void *memset(void *s, int c, size_t n);
+#else
+#define memset(s, c, n) __builtin_memset((s),(c),(n))
+#endif
+
 
 #define __HAVE_ARCH_MEMMOVE
 void * memmove(void * dest,const void *src,size_t count);
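
As a user-space illustration of what the macro branch relies on, the sketch below calls the GCC/Clang builtin directly with a small constant size. The struct and function names are hypothetical. Whether the compiler actually expands the call inline depends on its version and optimization flags, so treat this as a sketch rather than a guarantee about the generated code.

```c
#include <stddef.h>

/* Hypothetical small structure with a compile-time constant size. */
struct small {
    int a;
    int b;
};

static void clear_small(struct small *s)
{
    /*
     * GCC/Clang builtin. With a small constant length the compiler is
     * free to emit a couple of plain stores instead of a function call.
     */
    __builtin_memset(s, 0, sizeof(*s));
}
```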

Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot


Måns Rullgård wrote:
> Chris Boot <[EMAIL PROTECTED]> writes:
>
>> All,
>>
>> I've got a box running RHEL5 and haven't been impressed by ext3
>> performance on it (running off a 1.5TB HP MSA20 using the cciss
>> driver). I compiled XFS as a module and tried it out since I'm used to
>> using it on Debian, which runs much more efficiently. However, every
>> so often the kernel panics as below. Apologies for the tainted kernel,
>> but we run VMware Server on the box as well.
>>
>> Does anyone have any hints/tips for using XFS on Red Hat? What's
>> causing the panic below, and is there a way around this?
>>
>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>> printing eip:
>> c0415974
>> *pde = 
>> Oops:  [#1]
>> SMP last sysfs file: /block/loop7/dev
>> Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
>> autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
>> vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
>> ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
>> i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
>> dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
>> scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
>> uhci_hcd
>> CPU:1
>> EIP:0060:[]Tainted: P  VLI
>> EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1)
>> EIP is at smp_send_reschedule+0x3/0x53
>> eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
>> esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
>> ds: 007b   es: 007b   ss: 0068
>> Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
>> Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
>> 000f  0001 0001 c200c6e0 0100 
>> 0069 0180 018fc500 c200d240 0003 0292 f601efc0
>> f6027e00  0050
>> Call Trace:
>> [] try_to_wake_up+0x351/0x37b
>> [] xfsbufd_wakeup+0x28/0x49 [xfs]
>> [] shrink_slab+0x56/0x13c
>> [] try_to_free_pages+0x162/0x23e
>> [] __alloc_pages+0x18d/0x27e
>> [] find_or_create_page+0x53/0x8c
>> [] __getblk+0x162/0x270
>> [] do_lookup+0x53/0x157
>> [] ext3_getblk+0x7c/0x233 [ext3]
>> [] ext3_getblk+0xeb/0x233 [ext3]
>> [] mntput_no_expire+0x11/0x6a
>> [] ext3_bread+0x13/0x69 [ext3]
>> [] htree_dirblock_to_tree+0x22/0x113 [ext3]
>> [] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
>> [] do_path_lookup+0x20e/0x25f
>> [] get_empty_filp+0x99/0x15e
>> [] ext3_permission+0x0/0xa [ext3]
>> [] ext3_readdir+0x1ce/0x59b [ext3]
>> [] filldir+0x0/0xb9
>> [] sys_fstat64+0x1e/0x23
>> [] vfs_readdir+0x63/0x8d
>> [] filldir+0x0/0xb9
>> [] sys_getdents+0x5f/0x9c
>> [] syscall_call+0x7/0xb
>> ===
>
> Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
> seems to be enough to overflow it.

Thanks, that explains a lot. However, I don't have any XFS filesystems
mounted over loop devices on ext3. Earlier in the day I had iso9660 on
loop on xfs, could that have caused the issue? It was unmounted and
deleted when this panic occurred.


I'll probably just try and recompile the kernel with 8k stacks and see 
how it goes. Screw the support, we're unlikely to get it anyway. :-P


Many thanks,
Chris


LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures)

2007-08-18 Thread Stefan Richter

Nick Piggin wrote:
> Stefan Richter wrote:
>> Nick Piggin wrote:
>>
>>> I don't know why people would assume volatile of atomics. AFAIK, most
>>> of the documentation is pretty clear that all the atomic stuff can be
>>> reordered etc. except for those that modify and return a value.
>>
>>
>> Which documentation is there?
> 
> Documentation/atomic_ops.txt
> 
> 
>> For driver authors, there is LDD3.  It doesn't specifically cover
>> effects of optimization on accesses to atomic_t.
>>
>> For architecture port authors, there is Documentation/atomic_ops.txt.
>> Driver authors also can learn something from that document, as it
>> indirectly documents the atomic_t and bitops APIs.
>>
> 
> "Semantics and Behavior of Atomic and Bitmask Operations" is
> pretty direct :)
> 
> Sure, it says that it's for arch maintainers, but there is no
> reason why users can't make use of it.


Note, LDD3 page 238 says:  "It is worth noting that most of the other
kernel primitives dealing with synchronization, such as spinlock and
atomic_t operations, also function as memory barriers."

I don't know about Linux 2.6.10 against which LDD3 was written, but
currently only _some_ atomic_t operations function as memory barriers.

Besides, judging from some posts in this thread, saying that atomic_t
operations dealt with synchronization may not be entirely precise.
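
As an analogy only, the distinction can be shown with C11 atomics in user space, not the kernel's atomic_t API. The relaxed increment gives atomicity without any ordering guarantee, while the sequentially consistent one also orders surrounding memory accesses. The function names are hypothetical.

```c
#include <stdatomic.h>

static atomic_int counter;

/* Atomic, but imposes no ordering on surrounding loads and stores. */
static int bump_relaxed(void)
{
    return atomic_fetch_add_explicit(&counter, 1,
                                     memory_order_relaxed) + 1;
}

/* Atomic and also orders surrounding memory accesses. */
static int bump_ordered(void)
{
    return atomic_fetch_add_explicit(&counter, 1,
                                     memory_order_seq_cst) + 1;
}
```

Both variants count correctly. The difference only shows up in what other CPUs may observe about neighbouring memory accesses, which is why relying on an atomic operation as an implicit barrier needs checking case by case.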
-- 
Stefan Richter
-=-=-=== =--- =--=-
http://arcgraph.de/sr/

Re: [PATCH 002 of 6] Introduce rq_for_each_segment replacing rq_for_each_bio

2007-08-18 Thread Satyam Sharma

On Sat, 18 Aug 2007, Jan Engelhardt wrote:

> On Aug 18 2007 20:07, Satyam Sharma wrote:
> >On Fri, 17 Aug 2007, Geert Uytterhoeven wrote:
> >
> >> On Thu, 16 Aug 2007, NeilBrown wrote:
> >> [...]
> >> >  dev_dbg(&dev->sbd.core,
> >> >  "%s:%u: bio %u: %u segs %u sectors from %lu\n",
> >> > -__func__, __LINE__, i, bio_segments(bio),
> >> > -bio_sectors(bio), sector);
> >> > -bio_for_each_segment(bvec, bio, j) {
> >> > +__func__, __LINE__, i, bio_segments(iter.bio),
> >> > +bio_sectors(iter.bio),
> >> > +(unsigned long)iter.bio->bi_sector);
> >> ^^^
> >> Superfluous cast: PS3 is 64-bit only, and casts are evil.
> 
> bi_sector is sector_t. The cast is ok, because printf will warn, and rightfully
> so since sector_t may just change its shape underneath. It would not be so much
> of a problem if printf() was not a varargs function, but it is, and hence,
> passing an object bigger than the format specifier can make problems.

Oh yeah, that's why the _cast_ _is_ needed in the first place, by the way.
I was mentioning why the cast itself should be (unsigned long long) otoh.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Satyam Sharma

On Sat, 18 Aug 2007, Segher Boessenkool wrote:

> > > GCC manual, section 6.1, "When
  ^^
> > > is a Volatile Object Accessed?" doesn't say anything of the
  ^^^
> > > kind.
  ^

> > True, "implementation-defined" as per the C standard _is_ supposed to mean
^

> > "unspecified behaviour where each implementation documents how the choice
> > is made". So ok, probably GCC isn't "documenting" this

> > implementation-defined behaviour which it is supposed to, but can't really
> > fault them much for this, probably.
> 
> GCC _is_ documenting this, namely in this section 6.1.

(Again totally petty, but) Yes, but ...

> It doesn't
  ^^
> mention volatile-casted stuff.  Draw your own conclusions.
  ^^

... exactly. So that's why I said "GCC isn't documenting _this_".

Man, try _reading_ mails before replying to them ...

Re: [PATCH 002 of 6] Introduce rq_for_each_segment replacing rq_for_each_bio

2007-08-18 Thread Jan Engelhardt


On Aug 18 2007 20:07, Satyam Sharma wrote:
>On Fri, 17 Aug 2007, Geert Uytterhoeven wrote:
>
>> On Thu, 16 Aug 2007, NeilBrown wrote:
>> [...]
>> >dev_dbg(&dev->sbd.core,
>> >"%s:%u: bio %u: %u segs %u sectors from %lu\n",
>> > -  __func__, __LINE__, i, bio_segments(bio),
>> > -  bio_sectors(bio), sector);
>> > -  bio_for_each_segment(bvec, bio, j) {
>> > +  __func__, __LINE__, i, bio_segments(iter.bio),
>> > +  bio_sectors(iter.bio),
>> > +  (unsigned long)iter.bio->bi_sector);
>> ^^^
>> Superfluous cast: PS3 is 64-bit only, and casts are evil.

bi_sector is sector_t. The cast is ok, because printf will warn, and rightfully
so since sector_t may just change its shape underneath. It would not be so much
of a problem if printf() was not a varargs function, but it is, and hence,
passing an object bigger than the format specifier can make problems.


Jan

strange probe order issue after platform_add_devices()

2007-08-18 Thread hufey

Hi list,

I am facing a very strange issue about device probe order.
I have a piece of code to support network interface card for an embedded box.

It looks like the following:

...
static struct platform_device *custom_devices[] __initdata = {
  _device,
  _device,
  _device,
};

static int __init device_init(void)
{
   return platform_add_devices(custom_devices,
                               ARRAY_SIZE(custom_devices));
}

subsys_initcall(device_init);

/* end */

This code works well with kernel 2.6.11. After platform_add_devices()
is executed, a_device's probe function is called first, then
b_device's probe function is called.
I merged it into 2.6.17, but the strange thing is that b_device's probe is
now always called first.
Since b_device's probe depends on a_device, it now always fails.

Does anyone know how to fix it? Or can someone give me a hint about the
mechanism by which the kernel decides the order in which devices are probed?

Thanks in advance and best regards,
hufeyy

Re: [PATCH 002 of 6] Introduce rq_for_each_segment replacing rq_for_each_bio

2007-08-18 Thread Satyam Sharma

On Fri, 17 Aug 2007, Geert Uytterhoeven wrote:

> On Thu, 16 Aug 2007, NeilBrown wrote:
> [...]
> > dev_dbg(&dev->sbd.core,
> > "%s:%u: bio %u: %u segs %u sectors from %lu\n",
> > -   __func__, __LINE__, i, bio_segments(bio),
> > -   bio_sectors(bio), sector);
> > -   bio_for_each_segment(bvec, bio, j) {
> > +   __func__, __LINE__, i, bio_segments(iter.bio),
> > +   bio_sectors(iter.bio),
> > +   (unsigned long)iter.bio->bi_sector);
> ^^^
> Superfluous cast: PS3 is 64-bit only, and casts are evil.

(Sorry for butting in), but I wonder if relying on that in code here
(granted, a PS3-only driver, but it's in drivers/ and not arch/) would
be good style. Why not just print it out as: "%llu" and use an (unsigned
long long) cast, considering that's the largest type sector_t can ever
have (irrespective of arch) ... I can see most of the other generic places
in the kernel (such as in drivers/md/) also using the latter (%llu with
unsigned long long cast) to get bi_sector printed.
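
A minimal userspace sketch of the "%llu" suggestion above. The sector_t
typedef here is only a stand-in (an assumption for illustration, not the
kernel's definition), and format_sector() is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: stand-in for the kernel's sector_t, whose width can
 * differ between configurations. */
typedef uint64_t sector_t;

/* Format a sector number without relying on the width of sector_t:
 * cast to unsigned long long and print with %llu.
 * Returns whatever snprintf() returns. */
static int format_sector(char *buf, size_t len, sector_t sector)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "sector %llu", (unsigned long long)sector);
}
```

The cast keeps the format string and the argument type in agreement no
matter how wide sector_t is on a given build, which is the point of
preferring it over "%lu" with an (unsigned long) cast.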

[PATCH] qconf: show red links for disabled options

2007-08-18 Thread Marco Costalba

When 'Show debug info' is checked, a list of links
to dependent symbols is shown in the info view's bottom-right pane.

Currently all links are in standard blue. With this patch
links to disabled symbols are shown in red instead.

This, together with 'Show all options', makes it quick to
check why a given option is hidden.

Signed-off-by: Marco Costalba <[EMAIL PROTECTED]>
---

I understand that color coding could be a poor choice for people
with color vision deficiencies.

I chose it anyway to avoid cluttering the output with additional signs/symbols,
which could become very ugly with long option lists.


  scripts/kconfig/qconf.cc |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/scripts/kconfig/qconf.cc b/scripts/kconfig/qconf.cc
index e4eeb59..e4927c1 100644
--- a/scripts/kconfig/qconf.cc
+++ b/scripts/kconfig/qconf.cc
@@ -1157,9 +1157,14 @@ void ConfigInfoView::expr_print_help(void
*data, struct symbol *sym, const char
QString str2 = print_filter(str);

if (sym && sym->name && !(sym->flags & SYMBOL_CONST)) {
+   bool disabled = (print_filter(sym_get_string_value(sym)) != 
"y");
+   if (disabled)
+   *text += "";
*text += QString().sprintf("", sym);
*text += str2;
*text += "";
+   if (disabled)
+   *text += "";
} else
*text += str2;
 }
-- 
1.5.3.rc4.67.gf9286

Re: [PATCH 6/6] UML - Fix hostfs style

2007-08-18 Thread Satyam Sharma

On Fri, 17 Aug 2007, Jeff Dike wrote:

> Style fixes in hostfs.

> Index: linux-2.6.22/fs/hostfs/hostfs_kern.c
> [...]
> @@ -6,22 +6,15 @@
>   * 2003-02-10 Petr Baudis <[EMAIL PROTECTED]>
>   */
>  
> -#include 
>  #include 
>  #include 
> -#include 
> -#include 
> +#include 
>  #include 
> -#include 
> -#include 
>  #include 
> -#include 
>  #include  /* mark_page_accessed */
> -#include 
>  #include "hostfs.h"
> -#include "kern_util.h"
> -#include "kern.h"
>  #include "init.h"
> +#include "kern.h"

Not really a style fix :-)

> @@ -328,17 +326,17 @@ int hostfs_readdir(struct file *file, vo
> [...]
> - if(error) break;
> + if (error) break;

if (error)
break;

> @@ -522,28 +523,28 @@ static int init_inode(struct inode *inod
> [...]
>   else type = OS_TYPE_DIR;

I wonder what's the generally accepted / followed coding style for this,
actually. Personally I'd prefer:

else
type = OS_TYPE_DIR;

>   else inode->i_op = _iops;

>   else inode->i_fop = _file_fops;

>   else inode->i_mapping->a_ops = _aops;

>   else error = read_name(inode, name);

Ditto.

>   else
>   err = access_file(name, r, w, x);

Here we've used the (different, and preferred IMHO) style. You could
make this style common throughout this file.

> + else if (err > 0) {

This is fine, by the way. "if", or even a "{" in the same line after
"else" is okay, but not a statement by itself.

> Index: linux-2.6.22/fs/hostfs/hostfs_user.c
> [...]
> -#include 
>  #include 
> -#include 
> +#include 
> +#include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include "hostfs.h"
> -#include "kern_util.h"
> +#include "os.h"
>  #include "user.h"
> +#include 

Not a style fix again ...

>   else return OS_TYPE_FILE;

>   else return 0;

>   else panic("Impossible mode in open_file");

>   else return fd;

For the "else return" cases, you could consider making the code such that
there's a single return at the end, and a "ret" that is set by the code
appropriately. You'll find counter-examples, sure, but often multiple
"return"s in a function are confusing from a style point of view.

Otherwise, I saw both the patches I was cc'ed on, and both look good
to me, thank you.
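
For the "else return" cases discussed above, a minimal sketch of the
single-return shape. The function name and the mode constants are
hypothetical and chosen only for illustration; they are not taken from
hostfs:

```c
#include <assert.h>

/* Hypothetical mode codes, for illustration only. */
enum { MODE_RDONLY = 0, MODE_WRONLY = 1, MODE_RDWR = 2 };

/* Map read/write request flags to a mode code, or -1 if neither
 * flag is set. Every branch assigns ret; there is one return. */
static int access_mode(int r, int w)
{
    int ret;

    if (r && w)
        ret = MODE_RDWR;
    else if (r)
        ret = MODE_RDONLY;
    else if (w)
        ret = MODE_WRONLY;
    else
        ret = -1;

    return ret;
}
```

Each "else" is followed by its statement on the next line, and the value
is returned from one place at the end.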

Re: Marvell 88E8056 gigabit ethernet controller

2007-08-18 Thread Kevin E

--- Willy Tarreau <[EMAIL PROTECTED]> wrote:

> OK, in this trace, both controllers are on the same
> bus. The broken
> one has 'Capabilities: [100] Advanced Error
> Reporting' the other
> does not have, and the bridge to this bus has two
> more capabilities :
> 'Capabilities: [100] Virtual Channel' and
> 'Capabilities: [180] Unknown (5)'.
> 
> I don't know whether it can jutify a different
> behaviour. Also, maybe this
> is caused by a minuscule difference in the BIOS
> setup ?

I just checked the BIOS settings on the two machines and
there was one slight difference.  On the working machine I
had the serial port turned off and the parallel port set up
differently, so I changed the broken machine's
BIOS settings so they are identical.  After making the change I
checked 'lspci -vvv' on the broken one and it's the
same output as before.

I tested the Marvell interface to see if it made any
difference and the line died after about a second of
transferring data.  dmesg reported this:

sky2 eth1: enabling interface
sky2 eth1: ram buffer 0K
sky2 eth1: Link is up at 100 Mbps, full duplex, flow
control both
sky2 :04:00.0: error interrupt status=0x8000
sky2 eth1: hw error interrupt status 0x8
sky2 eth1: MAC parity error
sky2 :04:00.0: error interrupt status=0x8000
sky2 eth1: hw error interrupt status 0x8
sky2 eth1: MAC parity error
sky2 eth1: disabling interface

Kevin


[rfc patch] firewire: fw-ohci: enforce read order for selfID generation

2007-08-18 Thread Stefan Richter

It seems unlikely, but access to self_id_cpu[0] could at least in theory
be deferred until after the loop over self_id_cpu[1..n] or even after
the subsequent reg_read.  Enforce the desired order by a read barrier.

Also prevent the reg_read from being reordered relative to the for loop.
This isn't necessary if the loop's conditional printk counts as an
implicit barrier, but better make it explicit.

(self_id_cpu[] is a coherent DMA buffer.)

Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
---
 drivers/firewire/fw-ohci.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux/drivers/firewire/fw-ohci.c
===
--- linux.orig/drivers/firewire/fw-ohci.c
+++ linux/drivers/firewire/fw-ohci.c
@@ -30,6 +30,7 @@
 
 #include 
 #include 
+#include 
 
 #include "fw-transaction.h"
 #include "fw-ohci.h"
@@ -926,6 +927,7 @@ static void bus_reset_tasklet(unsigned l
 
self_id_count = (reg_read(ohci, OHCI1394_SelfIDCount) >> 3) & 0x3ff;
generation = (le32_to_cpu(ohci->self_id_cpu[0]) >> 16) & 0xff;
+   rmb();
 
for (i = 1, j = 0; j < self_id_count; i += 2, j++) {
if (ohci->self_id_cpu[i] != ~ohci->self_id_cpu[i + 1])
@@ -946,7 +948,7 @@ static void bus_reset_tasklet(unsigned l
 * the two generations match we know we have a consistent set
 * of self IDs.
 */
-
+   barrier();
new_generation = (reg_read(ohci, OHCI1394_SelfIDCount) >> 16) & 0xff;
if (new_generation != generation) {
fw_notify("recursive bus reset detected, "

-- 
Stefan Richter
-=-=-=== =--- =--=-
http://arcgraph.de/sr/
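
A rough userspace analogue of the read ordering the patch above is after,
written with C11 atomics rather than the kernel's rmb() and barrier(),
which are not the same primitives. Everything here is an assumption made
for illustration: the struct layout, the names, and the use of a single
generation counter (the driver compares a value from the DMA buffer with
one from a register).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SELFID_WORDS 4   /* hypothetical buffer size */

/* Hypothetical stand-in for a buffer that another agent may rewrite. */
struct selfid_buf {
    _Atomic uint32_t generation;
    _Atomic uint32_t ids[SELFID_WORDS];
};

/* Copy n words out and report whether the generation was unchanged
 * across the copy. Returns 1 if the snapshot is consistent, else 0. */
static int read_consistent(struct selfid_buf *s, uint32_t *out, size_t n)
{
    uint32_t gen, new_gen;
    size_t i;

    assert(s != NULL && out != NULL && n <= SELFID_WORDS);

    gen = atomic_load_explicit(&s->generation, memory_order_relaxed);
    /* Keep the generation read ahead of the data reads. */
    atomic_thread_fence(memory_order_acquire);

    for (i = 0; i < n; i++)
        out[i] = atomic_load_explicit(&s->ids[i], memory_order_relaxed);

    /* Keep the re-read of the generation behind the data reads. */
    atomic_thread_fence(memory_order_acquire);
    new_gen = atomic_load_explicit(&s->generation, memory_order_relaxed);

    return new_gen == gen;
}
```

A caller would retry, or report a recursive reset as the driver does,
when the function returns 0.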


Re: Corrupted filesystem with new Firewire stack

2007-08-18 Thread Stefan Richter

Martin K. Petersen wrote:
>> "Stefan" == Stefan Richter <[EMAIL PROTECTED]> writes:
> 
> Stefan> There were some similar reports involving that "status write
> Stefan> for unknown orb".  I haven't found a way to reproduce it; I
> Stefan> noticed it only once in the logs here so far.
> 
> I get those all the time.  Just do heavy ext3 I/O to the drive.
> 
> Happens here on both a G4 and an intel Mini.  Both running FC7.
> 
> Aug 17 08:24:08 mini kernel: firewire_sbp2: status write for unknown orb
> Aug 17 08:25:08 mini kernel: firewire_sbp2: sbp2_scsi_abort
> Aug 17 08:26:36 mini kernel: firewire_sbp2: status write for unknown orb
> Aug 17 08:27:36 mini kernel: firewire_sbp2: sbp2_scsi_abort
> Aug 17 08:33:51 mini kernel: firewire_sbp2: status write for unknown orb
> Aug 17 08:34:51 mini kernel: firewire_sbp2: sbp2_scsi_abort
> 
> Lacie drive in both cases.

I replaced a HDD in my OFXW911 enclosure and am starting tests now.

While backing ~80 GB from its current reiserfs partition up in order to
reformat to ext3, using find | cpio, I got that error 4 times:

Aug 18 13:24:58 stein ReiserFS: sdd1: Using r5 hash to sort names
Aug 18 13:47:41 stein firewire_sbp2: status write for unknown orb
Aug 18 13:48:11 stein firewire_sbp2: sbp2_scsi_abort
Aug 18 14:18:30 stein firewire_sbp2: status write for unknown orb
Aug 18 14:19:00 stein firewire_sbp2: sbp2_scsi_abort
Aug 18 14:50:08 stein firewire_sbp2: status write for unknown orb
Aug 18 14:50:39 stein firewire_sbp2: sbp2_scsi_abort
Aug 18 14:57:24 stein firewire_sbp2: status write for unknown orb
Aug 18 14:57:54 stein firewire_sbp2: sbp2_scsi_abort

cpio finished with exit code 0.  After remounting the disk, I compared one
of the directories in which a command had been aborted; it seems that
the error was properly recovered from.

I will test if more of these errors occur in write access or with ext3,
but at least I have a way now to get one error per ~20 minutes
continuous IO, on average.  Then I will proceed to examine the new
stack's sources for potential related bugs.  Will take a while though
because I have other projects going on at the moment.
-- 
Stefan Richter
-=-=-=== =--- =--=-
http://arcgraph.de/sr/

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Satyam Sharma

On Fri, 17 Aug 2007, Linus Torvalds wrote:

> On Sat, 18 Aug 2007, Satyam Sharma wrote:
> > 
> > No code does (or would do, or should do):
> > 
> > x.counter++;
> > 
> > on an "atomic_t x;" anyway.
> 
> That's just an example of a general problem.
> 
> No, you don't use "x.counter++". But you *do* use
> 
>   if (atomic_read() <= 1)
> 
> and loading into a register is stupid and pointless, when you could just 
> do it as a regular memory-operand to the cmp instruction.

True, but that makes this a bad/poor code generation issue with the
compiler, not something that affects the _correctness_ of atomic ops if
"volatile" is used for that counter object (as was suggested), because
we'd always use the atomic_inc() etc primitives to do increments, which
are always (should be!) implemented to be atomic.

> And as far as the compiler is concerned, the problem is the 100% same: 
> combining operations with the volatile memop.
> 
> The fact is, a compiler that thinks that
> 
>   movl mem,reg
>   cmpl $val,reg
> 
> is any better than
> 
>   cmpl $val,mem
> 
> is just not a very good compiler.

Absolutely, this is definitely a bug report worth opening with gcc. And
what you've said to explain this previously sounds definitely correct --
seeing "volatile" for any access does appear to just scare the hell out
of gcc and makes it generate such (poor) code.

Satyam

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-18 Thread Satyam Sharma

[ LOL, you _are_ shockingly petty! ]

On Sat, 18 Aug 2007, Segher Boessenkool wrote:

> > > The documentation simply doesn't say "+m" is allowed.  The code to
> > > allow it was added for the benefit of people who do not read the
> > > documentation.  Documentation for "+m" might get added later if it
> > > is decided this [the code, not the documentation] is a sane thing
> > > to have (which isn't directly obvious).
> > 
> > Huh?
> > 
> > "If the (current) documentation doesn't match up with the (current)
> > code, then _at least one_ of them has to be (as of current) wrong."
> > 
> > I wonder how could you even try to disagree with that.
> 
> Easy.
> 
> The GCC documentation you're referring to is the user's manual.
> See the blurb on the first page:
> 
> "This manual documents how to use the GNU compilers, as well as their
> features and incompatibilities, and how to report bugs.  It corresponds
> to GCC version 4.3.0.  The internals of the GNU compilers, including
> how to port them to new targets and some information about how to write
> front ends for new languages, are documented in a separate manual."
> 
> _How to use_.  This documentation doesn't describe in minute detail
> everything the compiler does (see the source code for that -- no, it
> isn't described in the internals manual either).

Wow, now that's a nice "disclaimer". By your (poor) standards of writing
documentation, one can as well write any factually incorrect stuff that
one wants in a document once you've got such a blurb in place :-)

> If it doesn't tell you how to use "+m", and even tells you _not_ to
> use it, maybe that is what it means to say?  It doesn't mean "+m"
> doesn't actually do something.  It also doesn't mean it does what
> you think it should do.  It might do just that of course.  But treating
> writing C code as an empirical science isn't such a smart idea.

Oh, really? Considering how much is (left out of being) documented, often
one would reasonably have to experimentally see (with testcases) how the
compiler behaves for some given code. Well, at least _I_ do it often
(several others on this list do as well), and I think there's everything
smart about it rather than having to read gcc sources -- I'd be surprised
(unless you have infinite free time on your hands, which does look like
the case actually) if someone actually prefers reading gcc sources first
to know what/how gcc does something for some given code, rather than
simply write it out, compile and look at the generated code (saves time for
those who don't have an infinite amount of it).

> > And I didn't go whining about this ... you asked me. (I think I'd said
> > something to the effect of GCC docs are often wrong,
> 
> No need to guess at what you said, even if you managed to delete
> your own mail already, there are plenty of free web-based archives
> around.  You said:
> 
> > See, "volatile" C keyword, for all it's ill-definition and dodgy
> > semantics, is still at least given somewhat of a treatment in the C
> > standard (whose quality is ... ummm, sadly not always good and clear,
> > but unsurprisingly, still about 5,482 orders-of-magnitude times
> > better than GCC docs).

Try _reading_ what I said there, for a change, dude. I'd originally only
said "unless GCC's docs is yet again wrong" ... then _you_ asked me what,
after which this discussion began and I wrote the above [which I fully
agree with -- so what if I used hyperbole in my sentence (yup, that was
intended, and obviously, exaggeration), am I not even allowed to do that?
Man, you're a Nazi or what ...] I didn't go whining about on my own as
you'd had earlier suggested, until _you_ asked me.

[ Ick, I somehow managed to reply this ... this is such a ...
  *disgustingly* petty argument you made here. ]

> > which is true,
> 
> Yes, documentation of that size often has shortcomings.  No surprise
> there.  However, great effort is made to make it better documentation,
> and especially to keep it up to date; if you find any errors or
> omissions, please report them.  There are many ways how to do that,
> see the GCC homepage.
 ^^

Looks like you even get paid :-)

> > but probably you feel saying that is "not allowed" on non-gcc lists?)
> 
> [amazingly pointless stuff snipped]
> 
> > As for the "PR"
> > you're requesting me to file with GCC for this, that
> > gcc-patches@ thread did precisely that
> 
> [more amazingly pointless stuff snipped]
> 
> > and more (submitted a patch to
> > said documentation -- and no, saying "documentation might get added
> > later" is totally bogus and nonsensical -- documentation exists to
> > document current behaviour, not past).
> 
> When code like you want to write becomes a supported feature, that
> will be reflected in the user manual.  It is completely nonsensical
> to expect everything that is *not* a supported feature to be mentioned
> there.

What crap. It is _perfectly reasonable_ to expect (current)

Re: + proc-export-a-processes-resource-limits-via-proc-pid.patch added to -mm tree

2007-08-18 Thread Oleg Nesterov

On 08/18, Neil Horman wrote:
>
> On Sat, Aug 18, 2007 at 02:22:28AM +0400, Oleg Nesterov wrote:
> > Neil Horman wrote:
> > > 
> > > +static int proc_pid_limits(struct task_struct *task, char *buffer)
> > > +{
> > > + unsigned int i;
> > > + int count = 0;
> > > + char *bufptr = buffer;
> > > +
> > > + struct rlimit rlim[RLIM_NLIMITS];
> > > +
> > > + read_lock(_lock);
> > > + memcpy(rlim, task->signal->rlim, sizeof(struct rlimit) * RLIM_NLIMITS);
> > > + read_unlock(_lock);
> > 
> > Please don't re-introduce tasklist_lock unless strictly needed. And in this 
> > case
> > it doesn't help, sys_getrlimit() changes ->rlim[] under task_lock().
> > 
> > Hovewer, I think the whole patch is not right. The "tsk" itself is pinned, 
> > but its
> > ->signal is not stable and can be == NULL.
> > 
> > You can use lock_task_sighand() to access ->signal.
> > 
> You're right about the use of task_lock rather than tasklist_lock in 
> getrlimit,
> but the comment from lock_task_sighand indicates that its use is predicated on
> the prerequisite of locking tasklist_lock,

or rcu_read_lock(),

> so I think the situation is not that
> much of an issue.  From what I see the use of lock_task_sighand is used when
> modifying values in the signal struct, not when removing it entirely (IIRC it
> needs to exist until such time as all sharing processes exit.

yes, it needs to exist until the whole process exits, but no, __exit_signal()
sets ->signal == NULL under sighand->siglock. This btw happens per thread.

> The fact that we
> have an outstanding task struct we are using here guarantees its continued
> existence).

No. proc_info_read() finds a "pid_alive()" task with a valid signal, and bumps
its ->usage. But nothing prevents this thread from exiting (in fact, it may
already be dead); after that the parent can reap that task, or it can reap itself
if it was a detached thread.

This means that proc_read() can assume nothing about the task, except that its
task_struct can't disappear.

> Since we are only reading signal->rlimit, which is only written to
> from sys_setrlimit, we should be safe from corrupted limit reads, which at 
> worst
> would cause an erroneous transient data read, rather than any sort of
> panic/crash.

->signal == NULL leads to panic().

Oleg.


Re: [PATCH 6/6] Do not use FASTCALL for __alloc_pages_nodemask()

2007-08-18 Thread Andi Kleen

On Friday 17 August 2007 23:07:33 Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Mel Gorman wrote:
> 
> > Opinions as to why FASTCALL breaks on one machine are welcome.
> 
> Could we get rid of FASTCALL? AFAIK the compiler should automatically 
> choose the right calling convention?

It was a nop for some time because register parameters are always enabled
on i386 and AFAIK no other architectures ever used it. Some out-of-tree
trees seem to disable register parameters though, but that's not
really a concern.

-Andi


Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Jan Engelhardt


On Aug 18 2007 13:31, Måns Rullgård wrote:
>>
>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>> printing eip:
>> c0415974
>> *pde = 
>> Oops:  [#1]
>> SMP last sysfs file: /block/loop7/dev
>> Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
>> autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
>> vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
>> ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
>> i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
>> dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
>> scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
>> uhci_hcd
>> CPU:1
>> EIP:0060:[]Tainted: P  VLI
>> EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) EIP is at
>> smp_send_reschedule+0x3/0x53
>> eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
>> esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
>> ds: 007b   es: 007b   ss: 0068
>> Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
>> Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
>> 000f  0001 0001 c200c6e0 0100 
>> 0069 0180 018fc500 c200d240 0003 0292 f601efc0
>> f6027e00  0050 Call Trace:
>> [] try_to_wake_up+0x351/0x37b
>> [] xfsbufd_wakeup+0x28/0x49 [xfs]
>> [] shrink_slab+0x56/0x13c
[...]
>
>Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
>seems to be enough to overflow it.

I think we should include the vermagic string in oopses too, 
so that the flags SMP, PREEMPT, RT, 4KSTACKS, mod_unload, etc. are shown 
and the situation is a bit more apparent.



Jan
--

Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Måns Rullgård

Chris Boot <[EMAIL PROTECTED]> writes:

> All,
>
> I've got a box running RHEL5 and haven't been impressed by ext3
> performance on it (running of a 1.5TB HP MSA20 using the cciss
> driver). I compiled XFS as a module and tried it out since I'm used to
> using it on Debian, which runs much more efficiently. However, every
> so often the kernel panics as below. Apologies for the tainted kernel,
> but we run VMware Server on the box as well.
>
> Does anyone have any hits/tips for using XFS on Red Hat? What's
> causing the panic below, and is there a way around this?
>
> BUG: unable to handle kernel paging request at virtual address b8af9d60
> printing eip:
> c0415974
> *pde = 
> Oops:  [#1]
> SMP last sysfs file: /block/loop7/dev
> Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
> autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
> vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
> ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
> i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
> dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
> scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
> uhci_hcd
> CPU:1
> EIP:0060:[]Tainted: P  VLI
> EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) EIP is at
> smp_send_reschedule+0x3/0x53
> eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
> esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
> ds: 007b   es: 007b   ss: 0068
> Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
> Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
> 000f  0001 0001 c200c6e0 0100 
> 0069 0180 018fc500 c200d240 0003 0292 f601efc0
> f6027e00  0050 Call Trace:
> [] try_to_wake_up+0x351/0x37b
> [] xfsbufd_wakeup+0x28/0x49 [xfs]
> [] shrink_slab+0x56/0x13c
> [] try_to_free_pages+0x162/0x23e
> [] __alloc_pages+0x18d/0x27e
> [] find_or_create_page+0x53/0x8c
> [] __getblk+0x162/0x270
> [] do_lookup+0x53/0x157
> [] ext3_getblk+0x7c/0x233 [ext3]
> [] ext3_getblk+0xeb/0x233 [ext3]
> [] mntput_no_expire+0x11/0x6a
> [] ext3_bread+0x13/0x69 [ext3]
> [] htree_dirblock_to_tree+0x22/0x113 [ext3]
> [] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
> [] do_path_lookup+0x20e/0x25f
> [] get_empty_filp+0x99/0x15e
> [] ext3_permission+0x0/0xa [ext3]
> [] ext3_readdir+0x1ce/0x59b [ext3]
> [] filldir+0x0/0xb9
> [] sys_fstat64+0x1e/0x23
> [] vfs_readdir+0x63/0x8d
> [] filldir+0x0/0xb9
> [] sys_getdents+0x5f/0x9c
> [] syscall_call+0x7/0xb
> ===

Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.

-- 
Måns Rullgård
[EMAIL PROTECTED]

