Re: Kernel Development & Objective-C

2007-12-05 Thread Gilboa Davara

On Tue, 2007-12-04 at 12:50 -0500, Lennart Sorensen wrote:
> On Mon, Dec 03, 2007 at 02:35:31PM +0200, Gilboa Davara wrote:
> > Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
> > second. (theoretical peak at 1514bytes/frame)
> > Granted, installing such a device on a single CPU/single core machine is
> > absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
> > Barcelona) it can still generate ~1M packets/s per core.
> 
> 10GbE can't do 14M packets per second if the packets are 1514 bytes.  At
> 10M packets per second you have less than 1000 bits per packet, which is
> far from 1514bytes.
> 
> 10Gbps gives you at most 1.25GBps, which at 1514 bytes per packet works
> out to 825627 packets per second.  You could reach ~14M packets per
> second with only the smallest packet size, which is rather unusual for
> high throughput traffic, since you waste almost all the bytes on
> overhead in that case.  But you do want to be able to handle at least a
> million or two packets per second to do 10GbE.

... I corrected my math in the second email. [1] 

Never the less, a VOIP network (E.g. G729 and friends) can generate the
maximum number of frames allowed on 10GbE Ethernet which is, AFAIR just
below 15M -per- port. (~29M on a dual port card)

While I doubt that any non-NPU based NIC can handle such a load, on
mixed networks we're already seeing well-above 1M frames per port.

- Gilboa
[1] http://lkml.org/lkml/2007/12/3/69


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kernel Development Objective-C

2007-12-05 Thread Gilboa Davara

On Tue, 2007-12-04 at 12:50 -0500, Lennart Sorensen wrote:
 On Mon, Dec 03, 2007 at 02:35:31PM +0200, Gilboa Davara wrote:
  Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
  second. (theoretical peak at 1514bytes/frame)
  Granted, installing such a device on a single CPU/single core machine is
  absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
  Barcelona) it can still generate ~1M packets/s per core.
 
 10GbE can't do 14M packets per second if the packets are 1514 bytes.  At
 10M packets per second you have less than 1000 bits per packet, which is
 far from 1514bytes.
 
 10Gbps gives you at most 1.25GBps, which at 1514 bytes per packet works
 out to 825627 packets per second.  You could reach ~14M packets per
 second with only the smallest packet size, which is rather unusual for
 high throughput traffic, since you waste almost all the bytes on
 overhead in that case.  But you do want to be able to handle at least a
 million or two packets per second to do 10GbE.

... I corrected my math in the second email. [1] 

Never the less, a VOIP network (E.g. G729 and friends) can generate the
maximum number of frames allowed on 10GbE Ethernet which is, AFAIR just
below 15M -per- port. (~29M on a dual port card)

While I doubt that any non-NPU based NIC can handle such a load, on
mixed networks we're already seeing well-above 1M frames per port.

- Gilboa
[1] http://lkml.org/lkml/2007/12/3/69


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kernel Development & Objective-C

2007-12-03 Thread Gilboa Davara

On Mon, 2007-12-03 at 14:35 +0200, Gilboa Davara wrote:
> Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
> second. (theoretical peak at 1514bytes/frame)
> Granted, installing such a device on a single CPU/single core machine is
> absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
> Barcelona) it can still generate ~1M packets/s per core.

Sigh... Sorry. Please ignore the broken math on my part.
Make that 1.8M frames/second per card and ~100K packets/second per core.

- Gilboa


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kernel Development & Objective-C

2007-12-03 Thread Gilboa Davara

On Mon, 2007-12-03 at 07:12 +0200, Avi Kivity wrote:
> Andi Kleen wrote:
> > Avi Kivity <[EMAIL PROTECTED]> writes:
> >   
> >> [I really doubt there are that many of these; syscall
> >> entry/dispatch/exit, interrupt dispatch, context switch, what else?]
> >> 
> >
> > Networking, block IO, page fault, ... But only the fast paths in these 
> > cases. A lot of the kernel is slow path code and could probably
> > be written even in an interpreted language without much trouble.
> >
> >   
> 
> Even these (with the exception of the page fault path) are hardly "we 
> care about a single instruction" material suggested above.  Even with a 
> million packets per second per core (does such a setup actually exist?)  
> You have a few thousand cycles per packet.  For block you'd need around 
> 5,000 disks per core to reach such rate

Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
second. (theoretical peak at 1514bytes/frame)
Granted, installing such a device on a single CPU/single core machine is
absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
Barcelona) it can still generate ~1M packets/s per core.

Now assuming you're doing low-level (passive) filtering of some sort
(frame/packet routing, traffic interception and/or packet analysis)
using hardware assistance (TSO, complete TCP offloading, etc) is off the
table and each and every cycle within netif_receive_skb (and friends)
-counts-.

I don't suggest that the kernel should be (re)designed for such (niche)
applications but on other hand, if it works...

- Gilboa

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kernel Development Objective-C

2007-12-03 Thread Gilboa Davara

On Mon, 2007-12-03 at 07:12 +0200, Avi Kivity wrote:
 Andi Kleen wrote:
  Avi Kivity [EMAIL PROTECTED] writes:

  [I really doubt there are that many of these; syscall
  entry/dispatch/exit, interrupt dispatch, context switch, what else?]
  
 
  Networking, block IO, page fault, ... But only the fast paths in these 
  cases. A lot of the kernel is slow path code and could probably
  be written even in an interpreted language without much trouble.
 

 
 Even these (with the exception of the page fault path) are hardly we 
 care about a single instruction material suggested above.  Even with a 
 million packets per second per core (does such a setup actually exist?)  
 You have a few thousand cycles per packet.  For block you'd need around 
 5,000 disks per core to reach such rate

Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
second. (theoretical peak at 1514bytes/frame)
Granted, installing such a device on a single CPU/single core machine is
absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
Barcelona) it can still generate ~1M packets/s per core.

Now assuming you're doing low-level (passive) filtering of some sort
(frame/packet routing, traffic interception and/or packet analysis)
using hardware assistance (TSO, complete TCP offloading, etc) is off the
table and each and every cycle within netif_receive_skb (and friends)
-counts-.

I don't suggest that the kernel should be (re)designed for such (niche)
applications but on other hand, if it works...

- Gilboa

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kernel Development Objective-C

2007-12-03 Thread Gilboa Davara

On Mon, 2007-12-03 at 14:35 +0200, Gilboa Davara wrote:
 Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
 second. (theoretical peak at 1514bytes/frame)
 Granted, installing such a device on a single CPU/single core machine is
 absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
 Barcelona) it can still generate ~1M packets/s per core.

Sigh... Sorry. Please ignore the broken math on my part.
Make that 1.8M frames/second per card and ~100K packets/second per core.

- Gilboa


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Redundent/missing code in dik_show_trace (arch/alpha/kernel/traps.c)

2007-09-21 Thread Gilboa Davara
Hello all,

As the title suggest.
"i" is declared as 0 and never assigned a new value.
Down the code there's an if (i>40) and a certain dangling piece of code.
I assume that I should have gotten a string length (?) from somewhere?

Am I missing something?

- Gilboa

---
static void
dik_show_trace(unsigned long *sp)
{
long i = 0;
printk("Trace:\n");
while (0x1ff8 & (unsigned long) sp) {
extern char _stext[], _etext[];
unsigned long tmp = *sp;
sp++;
if (tmp < (unsigned long) &_stext)
continue;
if (tmp >= (unsigned long) &_etext)
continue;
printk("[<%lx>]", tmp);
print_symbol(" %s", tmp);
printk("\n");
if (i > 40) {
printk(" ...");
break;
}
}
printk("\n");
}


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage. (v3)

2007-09-21 Thread Gilboa Davara
On Fri, 2007-09-21 at 17:02 +0100, Paulo Marques wrote:
> Gilboa Davara wrote:
> > Hello all,
> 
> Hi, Gilboa
> 
> > (1) Problem:
> > I. When CONFIG_4KSTACKS and CONFIG_DEBUG_STACKOVERFLOW are enabled on
> > i386 kernels, do_IRQ calls dump_stack which, down the path, uses
> > print_symbol (display) and  sprint_symbol (format) to format and display
> > the function name/address/module.
> > Both function use stack based char array (~350 bytes) that, given the
> > initial state (<512 bytes of stack space) may overrun the stack.
> > II. (Comments - previous patches) Using spinlock protected static
> > storage within these functions might block or even deadlock dump_stack
> > (E.g. Crash within dump_stack itself)
> > 
> > (2) Solution:
> > I. Break sprint_symbol into sprint_symbol (API functions; keeps the
> > current interface) and sprint_symbol_helper (helper function with
> > minimal local storage). 
> > II. Replace the char array in __print_symbol with two spinlock protected
> > static char arrays; call the __sprint_symbol helper function instead of
> > sprint_symbol.
> > III. Ignore the spinlock if oops_in_progress is set.
> 
> This is getting more and more convoluted :(
> 
> The problem with the spinlock isn't just that during a panic, we can not 
> trust the kernel structures enough to use spinlocks. It might well 
> happen that lockdep code might want to use print_symbol (and I think it 
> does, so this is not just theoretical) to dump the stack when someone 
> calls spin_lock_irqsave.
> But now, because print_symbol itself uses spin_lock_irqsave, we might 
> get into a recursive situation and a produce a deadlock.
> 
> On the other hand, if you take the other approach of reducing the stack 
> usage by creating a printk_symbol interface, the stack usage would drop 
> from 350 bytes to 128 bytes and your problem would go away entirely.
> 

OK. I'll wait for additional comments while thinking about a third
route. (Following your suggestion)

- Gilboa

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-21 Thread Gilboa Davara
On Fri, 2007-09-21 at 15:21 +0100, Paulo Marques wrote:
> Gilboa Davara wrote:
> > Hello Paulo,
> 
> Hi, Gilboa
> 
> Usually the developer can separate the output by hand in the unlikely 
> case of simultaneous concurrent users of printk, so I don't think this 
> is really a big problem.

In general I agree, but as the number of cores increases dramatically,
hitting a warning/BUG/oops simultaneously on multiple cores (especially
during development) becomes more of a problem.

At least to me, It's not rare that when working on 8C or above machine,
an oops message becomes completely unreadable due to multiple debug
printks that trash the serial port output. 

Never the less, I do agree that this problem is more developer related
and should not hit an end user running a debug-free distribution kernel.

> 
> >> Of course this requires changing _all_ callers of print_symbol to use 
> >> the new interface, but these are less than 100 ;)
> > 
> > This is my first contribution to the Linux kernel. As such I rather
> > start small, and work my way up slowly. (Read: solve the immediate stack
> > over-run now, think about changing the symbol_display interface later)
> 
> Yes, but this is a sensitive area, so you can not just implement 
> something now that can cause regressions, and just fix it later.

True.
My aim is to create a patch that
A. Doesn't break the current interface/APIs while keeping the
patch/noise to a minimum.
B. Does not cause any type of regression.

> 
> Kernel panics are quite rare and the information provided by the stack 
> dump is _extremely_ precious.

I fully agree.

> 
> Even more, risking deadlocks caused by something that should only be 
> used to produce debug information is even worse.

Hopefully the latest patch moves one step further in removing your
concerns.

> 
> >> Comments?
> > 
> > I do agree that the current interface needs work.
> > 
> > ... But as I said, I rather start slowly and on small scale. (Though I
> > did find a rather problematic place to start at... ;))
> 
> Well, if we agree that this is the way to go, then the way to start 
> slowly would be to submit a patch that makes both interfaces available 
> for a while and changes the most stack-critical callers to the new 
> interface.
> 
> Then slowly keep submitting patches to change other callers 
> progressively until there are no more callers of the old interface. At 
> that point we can just drop it entirely.

If I cannot find a working solution that does meet my initial goals
(Minimal patch/no regression) I'll start from scratch - creating a new
trace call path.

- Gilboa

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Minor patch] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-21 Thread Gilboa Davara
On Fri, 2007-09-21 at 10:47 -0400, Steven Rostedt wrote:
> On Sat, Sep 15, 2007 at 02:35:29PM +0300, Gilboa Davara wrote:
> > -   return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
> > +   if (namebuf)
> > +   kfree(namebuf);
> 
> Note, the if condition is not needed, kfree handles null pointers fine.
> 
> -- Steve

Steve,

This patch has been scrapped and by another,  (hopefully) less
problematic one [1].

Thanks for the comments,
Gilboa

[1] http://lkml.org/lkml/2007/9/21/180




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Reduce __print_symbol/sprint_symbol stack usage. (v3)

2007-09-21 Thread Gilboa Davara
Hello all,

(1) Problem:
I. When CONFIG_4KSTACKS and CONFIG_DEBUG_STACKOVERFLOW are enabled on
i386 kernels, do_IRQ calls dump_stack which, down the path, uses
print_symbol (display) and  sprint_symbol (format) to format and display
the function name/address/module.
Both function use stack based char array (~350 bytes) that, given the
initial state (<512 bytes of stack space) may overrun the stack.
II. (Comments - previous patches) Using spinlock protected static
storage within these functions might block or even deadlock dump_stack
(E.g. Crash within dump_stack itself)

(2) Solution:
I. Break sprint_symbol into sprint_symbol (API functions; keeps the
current interface) and sprint_symbol_helper (helper function with
minimal local storage). 
II. Replace the char array in __print_symbol with two spinlock protected
static char arrays; call the __sprint_symbol helper function instead of
sprint_symbol.
III. Ignore the spinlock if oops_in_progress is set.

(3) Comments:
I. Currently, if oops_in_progress is set, multiple callers can trash
each other.
II. In order to solve it, print_symbol (and/or dump_stack itself) must
be aware of it's own context. (Read: if it's being called by oops, or by
a "debug/warning" code).
III. Possible solutions:
A. Add oops_in_progress_cpuid to bust_spinlocks (Hack, but generates far
less noise).
B. Add priority tag to all the functions that handle the calls stuck.
(Nicer, requires a massive tree-changing patch.)

Signed-off-by: Gilboa Davara <[EMAIL PROTECTED]>

Satyam, Paulo,
 A. I hope this patch better complies with your comments.

- Gilboa
--- linux-2.6/kernel/kallsyms.orig  2007-09-20 19:17:49.0 +0200
+++ linux-2.6/kernel/kallsyms.c 2007-09-21 16:24:03.0 +0200
@@ -306,13 +306,17 @@ int lookup_symbol_attrs(unsigned long ad
return lookup_module_symbol_attrs(addr, size, offset, modname, name);
 }
 
-/* Look up a kernel symbol and return it in a text buffer. */
-int sprint_symbol(char *buffer, unsigned long address)
+/*
+ * Helper function:
+ *
+ * Look up a kernel symbol and module name and return them to the
+ *   caller's buffer/namebuf buffers.
+*/
+int sprint_symbol_helper(char *buffer, char *namebuf, unsigned long
address)
 {
-   char *modname;
-   const char *name;
unsigned long offset, size;
-   char namebuf[KSYM_NAME_LEN];
+   const char *name;
+   char *modname;
 
name = kallsyms_lookup(address, , , , namebuf);
if (!name)
@@ -325,14 +329,52 @@ int sprint_symbol(char *buffer, unsigned
return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
 }
 
+/*
+ * API function:
+ *
+ * Look up a kernel symbol and return it in a text buffer.
+ */
+int sprint_symbol(char *buffer, unsigned long address)
+{
+   char namebuf[KSYM_NAME_LEN];
+
+   return sprint_symbol_helper(buffer, namebuf, address);
+}
+
+
 /* Look up a kernel symbol and print it to the kernel messages. */
 void __print_symbol(const char *fmt, unsigned long address)
 {
-   char buffer[KSYM_SYMBOL_LEN];
-
-   sprint_symbol(buffer, address);
+   /*
+* Use static buffers instead of char array to reduce the
+*   stack footprint in i386/4KSTACKS.
+* Buffers must be protected against re-entry.
+* ( As long as Oops is not in progress. )
+*/
+   static DEFINE_SPINLOCK(symbol_lock);
+   static char namebuf[KSYM_NAME_LEN];
+   static char buffer[KSYM_SYMBOL_LEN];
+   unsigned long flags;
+   int locked = 0;
+
+
+   /*
+* Bust all spinlocks if oops is in progress.
+* FIXME: Might generate broken output if BUG()/oops
+*   generates a callstack while another one uses
+*   dump_stack for debug purposes.
+*/
+   if (!oops_in_progress) {
+   /* Normal lock mode. */
+   spin_lock_irqsave(_lock, flags);
+   locked = 1;
+   }
 
+   sprint_symbol_helper(buffer, namebuf, address);
printk(fmt, buffer);
+
+   if (locked)
+   spin_unlock_irqrestore(_lock, flags);
 }
 
 /* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-21 Thread Gilboa Davara
Hello Paulo,

[snip]
> I must say I agree with Satyam here.
> 
> Locking in the panic path might leave us without some critical debug 
> information, which is much more important than all this.
> 
> Maybe it would be better to change the print_symbol interface to avoid 
> having a "char buffer[KSYM_SYMBOL_LEN];" at all.
> 
> Most print_symbol callers use something like "yada yada %s" as the 
> format string, with an optional "\n" in the end.
> 
> if we change the interface from "print_symbol(fmt, addr)" to 
> "print_symbol(prefix, addr, int newline)" we can simply do:
> 
> printk(prefix);
> printk_symbol(addr);
> if (newline)
>   printk("\n");
> 
> where "printk_symbol" is a new function that does the same as 
> sprint_symbol, but does "printk" instead of "sprintf".
> 
> This should reduce immensely the stack usage of print_symbol without the 
> need for locking.

I fully agree.
... Further more, multiple printk_symbols should be combined into a
single, multi-line printk transaction. (To prevent debug printk's from
trashing a BUG() dump_stack). 

> 
> Of course this requires changing _all_ callers of print_symbol to use 
> the new interface, but these are less than 100 ;)

This is my first contribution to the Linux kernel. As such I rather
start small, and work my way up slowly. (Read: solve the immediate stack
over-run now, think about changing the symbol_display interface later)

> 
> Comments?

I do agree that the current interface needs work.

... But as I said, I rather start slowly and on small scale. (Though I
did find a rather problematic place to start at... ;))

- Gilboa

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-21 Thread Gilboa Davara
Hello Satyam,

On Wed, 2007-09-19 at 06:30 +0530, Satyam Sharma wrote: 
> Hi Gilboa,
> 
> 
> On Sat, 15 Sep 2007, Gilboa Davara wrote:
> > 
> > This is my second stab at solving the "stack over flow due to
> > dump_strace when close to stack-overflow is detected by do_IRQ" problem.
> > (Hopefully) this patch is creates less noise then the previous one.
> > 
> > [snip]
> > > I'll try and create an option 2 (static allocation, minimal locking)
> > > patch and post ASAP.
> > > Hopefully it'll fare better. (While keeping the current interface intact
> > > and reducing the damage/noise)
> > 
> > - Gilboa
> > 
> > --- linux-2.6/kernel/kallsyms.orig  2007-09-15 11:46:54.0 +0300
> > +++ linux-2.6/kernel/kallsyms.c 2007-09-15 21:06:55.0 +0300
> > @@ -306,13 +306,14 @@ int lookup_symbol_attrs(unsigned long ad
> > return lookup_module_symbol_attrs(addr, size, offset, modname, name);
> >  }
> >  
> > -/* Look up a kernel symbol and return it in a text buffer. */
> > -int sprint_symbol(char *buffer, unsigned long address)
> > +/* Internal version:
> > +   Look up a kernel symbol and module name and return them to the
> > + caller's buffer/namebuf buffers. */
> 
> /*
>  * ...
>  * ...
>  */
> 
> is the general coding style here ...

ACK.
Will be fixed ASAP.

> 
> > +int __sprint_symbol(char *buffer, char *namebuf, unsigned long address)
> >  {
> > -   char *modname;
> > -   const char *name;
> > unsigned long offset, size;
> > -   char namebuf[KSYM_NAME_LEN];
> > +   const char *name;
> > +   char *modname;
> >  
> > name = kallsyms_lookup(address, , , , namebuf);
> > if (!name)
> > @@ -325,14 +326,35 @@ int sprint_symbol(char *buffer, unsigned
> > return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
> >  }
> >  
> > +/* Exported version:
> > +   Look up a kernel symbol and return it in a text buffer. */
> 
> ditto.
> 
> > +int sprint_symbol(char *buffer, unsigned long address)
> > +{
> > +   char namebuf[KSYM_NAME_LEN];
> 
> Hmm, don't we intend to push this array out of the stack too?
> 
> + static char namebuf[KSYM_NAME_LEN];
> + static DEFINE_SPINLOCK(namebuf_lock);
> 
> here ?

I did my best to keep the old interface (sprint_symbol), char arrays and
all, fixing only the critical dump_stack path. (Hence, print_symbol no
longer calls sprint_symbol - it calls the __sprint_symbol helper
function instead.

> 
> > +
> > +   return __sprint_symbol(buffer, namebuf, address);
> 
> And you'd need to wrap spin_lock_irqsave()/spin_unlock_irqrestore()
> around this call.

As I'm keeping the old interface (with it's static char arrays) for
non-dump_stack paths, there's no need to use locks around sprint_symbol.

> 
> > +}
> 
> 
> > +static DEFINE_SPINLOCK(symbol_lock);
> 
> Try to keep the declarations of a lock, and the data that it protects,
> close together. Since this lock is being used to protect "buffer", it
> makes sense to ...

ACK. 

> 
> 
> >  /* Look up a kernel symbol and print it to the kernel messages. */
> >  void __print_symbol(const char *fmt, unsigned long address)
> >  {
> > -   char buffer[KSYM_SYMBOL_LEN];
> > +   /* Use static buffers instead of char array to reduce
> > +stack footprint in i386/4KSTACKS.
> > +Buffers must be protected against re-entry. */
> > +   static char namebuf[KSYM_NAME_LEN];
> > +   static char buffer[KSYM_SYMBOL_LEN];
> 
> ... have it:
> 
> + static DEFINE_SPINLOCK(buffer_lock);

ACK.

> 
> here (note the name that exactly describes what the lock protects).
> 
> And the namebuf array isn't required here, it's already there in
> sprint_symbol(), which you can call from ...
> 
> > +   unsigned long flags;
> > +
> >  
> > -   sprint_symbol(buffer, address);
> > +   spin_lock_irqsave(_lock, flags);
> > +
> > +   __sprint_symbol(buffer, namebuf, address);
> 
> here ... sprint_symbol() ?
> 
> > printk(fmt, buffer);
> > +
> > +   spin_unlock_irqrestore(_lock, flags);
> 
> But I still don't much like this :-(

Yep. Still needs work.
... I'll get there. Dead or alive. ;)

> 
> More importantly, if a panic occurs *below* this callchain (and let's
> say we ended up in this callchain because somebody put in a dump_stack()
> somewhere for debugging purposes), then we'd have a deadlock on our hands,
> and nothing gets printed for that panic.

The -best- solution is to use locking in normal circumstances (E.g.
debug code, sleep while atomic ,etc) -but- drop all locks when BUG/Oops
is being hit.

I believe the next patch will take care of this problem.

> 
> I don't know who maintains this part of kernel code, but you can try
> resubmitting (with the changes suggested above) to someone appropriate ...

Any suggestions?

- Gilboa

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-21 Thread Gilboa Davara
Hello Satyam,

On Wed, 2007-09-19 at 06:30 +0530, Satyam Sharma wrote: 
 Hi Gilboa,
 
 
 On Sat, 15 Sep 2007, Gilboa Davara wrote:
  
  This is my second stab at solving the stack over flow due to
  dump_strace when close to stack-overflow is detected by do_IRQ problem.
  (Hopefully) this patch is creates less noise then the previous one.
  
  [snip]
   I'll try and create an option 2 (static allocation, minimal locking)
   patch and post ASAP.
   Hopefully it'll fare better. (While keeping the current interface intact
   and reducing the damage/noise)
  
  - Gilboa
  
  --- linux-2.6/kernel/kallsyms.orig  2007-09-15 11:46:54.0 +0300
  +++ linux-2.6/kernel/kallsyms.c 2007-09-15 21:06:55.0 +0300
  @@ -306,13 +306,14 @@ int lookup_symbol_attrs(unsigned long ad
  return lookup_module_symbol_attrs(addr, size, offset, modname, name);
   }
   
  -/* Look up a kernel symbol and return it in a text buffer. */
  -int sprint_symbol(char *buffer, unsigned long address)
  +/* Internal version:
  +   Look up a kernel symbol and module name and return them to the
  + caller's buffer/namebuf buffers. */
 
 /*
  * ...
  * ...
  */
 
 is the general coding style here ...

ACK.
Will be fixed ASAP.

 
  +int __sprint_symbol(char *buffer, char *namebuf, unsigned long address)
   {
  -   char *modname;
  -   const char *name;
  unsigned long offset, size;
  -   char namebuf[KSYM_NAME_LEN];
  +   const char *name;
  +   char *modname;
   
  name = kallsyms_lookup(address, size, offset, modname, namebuf);
  if (!name)
  @@ -325,14 +326,35 @@ int sprint_symbol(char *buffer, unsigned
  return sprintf(buffer, %s+%#lx/%#lx, name, offset, size);
   }
   
  +/* Exported version:
  +   Look up a kernel symbol and return it in a text buffer. */
 
 ditto.
 
  +int sprint_symbol(char *buffer, unsigned long address)
  +{
  +   char namebuf[KSYM_NAME_LEN];
 
 Hmm, don't we intend to push this array out of the stack too?
 
 + static char namebuf[KSYM_NAME_LEN];
 + static DEFINE_SPINLOCK(namebuf_lock);
 
 here ?

I did my best to keep the old interface (sprint_symbol), char arrays and
all, fixing only the critical dump_stack path. (Hence, print_symbol no
longer calls sprint_symbol - it calls the __sprint_symbol helper
function instead.

 
  +
  +   return __sprint_symbol(buffer, namebuf, address);
 
 And you'd need to wrap spin_lock_irqsave()/spin_unlock_irqrestore()
 around this call.

As I'm keeping the old interface (with it's static char arrays) for
non-dump_stack paths, there's no need to use locks around sprint_symbol.

 
  +}
 
 
  +static DEFINE_SPINLOCK(symbol_lock);
 
 Try to keep the declarations of a lock, and the data that it protects,
 close together. Since this lock is being used to protect buffer, it
 makes sense to ...

ACK. 

 
 
   /* Look up a kernel symbol and print it to the kernel messages. */
   void __print_symbol(const char *fmt, unsigned long address)
   {
  -   char buffer[KSYM_SYMBOL_LEN];
  +   /* Use static buffers instead of char array to reduce
  +stack footprint in i386/4KSTACKS.
  +Buffers must be protected against re-entry. */
  +   static char namebuf[KSYM_NAME_LEN];
  +   static char buffer[KSYM_SYMBOL_LEN];
 
 ... have it:
 
 + static DEFINE_SPINLOCK(buffer_lock);

ACK.

 
 here (note the name that exactly describes what the lock protects).
 
 And the namebuf array isn't required here, it's already there in
 sprint_symbol(), which you can call from ...
 
  +   unsigned long flags;
  +
   
  -   sprint_symbol(buffer, address);
  +   spin_lock_irqsave(symbol_lock, flags);
  +
  +   __sprint_symbol(buffer, namebuf, address);
 
 here ... sprint_symbol() ?
 
  printk(fmt, buffer);
  +
  +   spin_unlock_irqrestore(symbol_lock, flags);
 
 But I still don't much like this :-(

Yep. Still needs work.
... I'll get there. Dead or alive. ;)

 
 More importantly, if a panic occurs *below* this callchain (and let's
 say we ended up in this callchain because somebody put in a dump_stack()
 somewhere for debugging purposes), then we'd have a deadlock on our hands,
 and nothing gets printed for that panic.

The -best- solution is to use locking in normal circumstances (E.g.
debug code, sleep while atomic ,etc) -but- drop all locks when BUG/Oops
is being hit.

I believe the next patch will take care of this problem.

 
 I don't know who maintains this part of kernel code, but you can try
 resubmitting (with the changes suggested above) to someone appropriate ...

Any suggestions?

- Gilboa

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-21 Thread Gilboa Davara
Hello Paulo,

[snip]
 I must say I agree with Satyam here.
 
 Locking in the panic path might leave us without some critical debug 
 information, which is much more important than all this.
 
 Maybe it would be better to change the print_symbol interface to avoid 
 having a char buffer[KSYM_SYMBOL_LEN]; at all.
 
 Most print_symbol callers use something like yada yada %s as the 
 format string, with an optional \n in the end.
 
 if we change the interface from print_symbol(fmt, addr) to 
 print_symbol(prefix, addr, int newline) we can simply do:
 
 printk(prefix);
 printk_symbol(addr);
 if (newline)
   printk(\n);
 
 where printk_symbol is a new function that does the same as 
 sprint_symbol, but does printk instead of sprintf.
 
 This should reduce immensely the stack usage of print_symbol without the 
 need for locking.

I fully agree.
... Further more, multiple printk_symbols should be combined into a
single, multi-line printk transaction. (To prevent debug printk's from
trashing a BUG() dump_stack). 

 
 Of course this requires changing _all_ callers of print_symbol to use 
 the new interface, but these are less than 100 ;)

This is my first contribution to the Linux kernel. As such I rather
start small, and work my way up slowly. (Read: solve the immediate stack
over-run now, think about changing the symbol_display interface later)

 
 Comments?

I do agree that the current interface needs work.

... But as I said, I rather start slowly and on small scale. (Though I
did find a rather problematic place to start at... ;))

- Gilboa

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Reduce __print_symbol/sprint_symbol stack usage. (v3)

2007-09-21 Thread Gilboa Davara
Hello all,

(1) Problem:
I. When CONFIG_4KSTACKS and CONFIG_DEBUG_STACKOVERFLOW are enabled on
i386 kernels, do_IRQ calls dump_stack which, down the path, uses
print_symbol (display) and  sprint_symbol (format) to format and display
the function name/address/module.
Both function use stack based char array (~350 bytes) that, given the
initial state (512 bytes of stack space) may overrun the stack.
II. (Comments - previous patches) Using spinlock protected static
storage within these functions might block or even deadlock dump_stack
(E.g. Crash within dump_stack itself)

(2) Solution:
I. Break sprint_symbol into sprint_symbol (API functions; keeps the
current interface) and sprint_symbol_helper (helper function with
minimal local storage). 
II. Replace the char array in __print_symbol with two spinlock protected
static char arrays; call the __sprint_symbol helper function instead of
sprint_symbol.
III. Ignore the spinlock if oops_in_progress is set.

(3) Comments:
I. Currently, if oops_in_progress is set, multiple callers can trash
each other.
II. In order to solve it, print_symbol (and/or dump_stack itself) must
be aware of it's own context. (Read: if it's being called by oops, or by
a debug/warning code).
III. Possible solutions:
A. Add oops_in_progress_cpuid to bust_spinlocks (Hack, but generates far
less noise).
B. Add priority tag to all the functions that handle the calls stuck.
(Nicer, requires a massive tree-changing patch.)

Signed-off-by: Gilboa Davara [EMAIL PROTECTED]

Satyam, Paulo,
 A. I hope this patch better complies with your comments.

- Gilboa
--- linux-2.6/kernel/kallsyms.orig  2007-09-20 19:17:49.0 +0200
+++ linux-2.6/kernel/kallsyms.c 2007-09-21 16:24:03.0 +0200
@@ -306,13 +306,17 @@ int lookup_symbol_attrs(unsigned long ad
return lookup_module_symbol_attrs(addr, size, offset, modname, name);
 }
 
-/* Look up a kernel symbol and return it in a text buffer. */
-int sprint_symbol(char *buffer, unsigned long address)
+/*
+ * Helper function:
+ *
+ * Look up a kernel symbol and module name and return them to the
+ *   caller's buffer/namebuf buffers.
+*/
+int sprint_symbol_helper(char *buffer, char *namebuf, unsigned long
address)
 {
-   char *modname;
-   const char *name;
unsigned long offset, size;
-   char namebuf[KSYM_NAME_LEN];
+   const char *name;
+   char *modname;
 
name = kallsyms_lookup(address, size, offset, modname, namebuf);
if (!name)
@@ -325,14 +329,52 @@ int sprint_symbol(char *buffer, unsigned
return sprintf(buffer, %s+%#lx/%#lx, name, offset, size);
 }
 
+/*
+ * API function:
+ *
+ * Look up a kernel symbol and return it in a text buffer.
+ */
+int sprint_symbol(char *buffer, unsigned long address)
+{
+   char namebuf[KSYM_NAME_LEN];
+
+   return sprint_symbol_helper(buffer, namebuf, address);
+}
+
+
 /* Look up a kernel symbol and print it to the kernel messages. */
 void __print_symbol(const char *fmt, unsigned long address)
 {
-   char buffer[KSYM_SYMBOL_LEN];
-
-   sprint_symbol(buffer, address);
+   /*
+* Use static buffers instead of char array to reduce the
+*   stack footprint in i386/4KSTACKS.
+* Buffers must be protected against re-entry.
+* ( As long as Oops is not in progress. )
+*/
+   static DEFINE_SPINLOCK(symbol_lock);
+   static char namebuf[KSYM_NAME_LEN];
+   static char buffer[KSYM_SYMBOL_LEN];
+   unsigned long flags;
+   int locked = 0;
+
+
+   /*
+* Bust all spinlocks if oops is in progress.
+* FIXME: Might generate broken output if BUG()/oops
+*   generates a callstack while another one uses
+*   dump_stack for debug purposes.
+*/
+   if (!oops_in_progress) {
+   /* Normal lock mode. */
+   spin_lock_irqsave(symbol_lock, flags);
+   locked = 1;
+   }
 
+   sprint_symbol_helper(buffer, namebuf, address);
printk(fmt, buffer);
+
+   if (locked)
+   spin_unlock_irqrestore(symbol_lock, flags);
 }
 
 /* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Minor patch] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-21 Thread Gilboa Davara
On Fri, 2007-09-21 at 10:47 -0400, Steven Rostedt wrote:
 On Sat, Sep 15, 2007 at 02:35:29PM +0300, Gilboa Davara wrote:
  -   return sprintf(buffer, %s+%#lx/%#lx, name, offset, size);
  +   if (namebuf)
  +   kfree(namebuf);
 
 Note, the if condition is not needed, kfree handles null pointers fine.
 
 -- Steve

Steve,

This patch has been scrapped and by another,  (hopefully) less
problematic one [1].

Thanks for the comments,
Gilboa

[1] http://lkml.org/lkml/2007/9/21/180




-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-21 Thread Gilboa Davara
On Fri, 2007-09-21 at 15:21 +0100, Paulo Marques wrote:
 Gilboa Davara wrote:
  Hello Paulo,
 
 Hi, Gilboa
 
 Usually the developer can separate the output by hand in the unlikely 
 case of simultaneous concurrent users of printk, so I don't think this 
 is really a big problem.

In general I agree, but as the number of cores increases dramatically,
hitting a warning/BUG/oops simultaneously on multiple cores (especially
during development) becomes more of a problem.

At least to me, It's not rare that when working on 8C or above machine,
an oops message becomes completely unreadable due to multiple debug
printks that trash the serial port output. 

Never the less, I do agree that this problem is more developer related
and should not hit an end user running a debug-free distribution kernel.

 
  Of course this requires changing _all_ callers of print_symbol to use 
  the new interface, but these are less than 100 ;)
  
  This is my first contribution to the Linux kernel. As such I rather
  start small, and work my way up slowly. (Read: solve the immediate stack
  over-run now, think about changing the symbol_display interface later)
 
 Yes, but this is a sensitive area, so you can not just implement 
 something now that can cause regressions, and just fix it later.

True.
My aim is to create a patch that
A. Doesn't break the current interface/APIs while keeping the
patch/noise to a minimum.
B. Does not cause any type of regression.

 
 Kernel panics are quite rare and the information provided by the stack 
 dump is _extremely_ precious.

I fully agree.

 
 Even more, risking deadlocks caused by something that should only be 
 used to produce debug information is even worse.

Hopefully the latest patch moves one step further in removing your
concerns.

 
  Comments?
  
  I do agree that the current interface needs work.
  
  ... But as I said, I rather start slowly and on small scale. (Though I
  did find a rather problematic place to start at... ;))
 
 Well, if we agree that this is the way to go, then the way to start 
 slowly would be to submit a patch that makes both interfaces available 
 for a while and changes the most stack-critical callers to the new 
 interface.
 
 Then slowly keep submitting patches to change other callers 
 progressively until there are no more callers of the old interface. At 
 that point we can just drop it entirely.

If I cannot find a working solution that does meet my initial goals
(Minimal patch/no regression) I'll start from scratch - creating a new
trace call path.

- Gilboa

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage. (v3)

2007-09-21 Thread Gilboa Davara
On Fri, 2007-09-21 at 17:02 +0100, Paulo Marques wrote:
 Gilboa Davara wrote:
  Hello all,
 
 Hi, Gilboa
 
  (1) Problem:
  I. When CONFIG_4KSTACKS and CONFIG_DEBUG_STACKOVERFLOW are enabled on
  i386 kernels, do_IRQ calls dump_stack which, down the path, uses
  print_symbol (display) and  sprint_symbol (format) to format and display
  the function name/address/module.
  Both function use stack based char array (~350 bytes) that, given the
  initial state (512 bytes of stack space) may overrun the stack.
  II. (Comments - previous patches) Using spinlock protected static
  storage within these functions might block or even deadlock dump_stack
  (E.g. Crash within dump_stack itself)
  
  (2) Solution:
  I. Break sprint_symbol into sprint_symbol (API functions; keeps the
  current interface) and sprint_symbol_helper (helper function with
  minimal local storage). 
  II. Replace the char array in __print_symbol with two spinlock protected
  static char arrays; call the __sprint_symbol helper function instead of
  sprint_symbol.
  III. Ignore the spinlock if oops_in_progress is set.
 
 This is getting more and more convoluted :(
 
 The problem with the spinlock isn't just that during a panic, we can not 
 trust the kernel structures enough to use spinlocks. It might well 
 happen that lockdep code might want to use print_symbol (and I think it 
 does, so this is not just theoretical) to dump the stack when someone 
 calls spin_lock_irqsave.
 But now, because print_symbol itself uses spin_lock_irqsave, we might 
 get into a recursive situation and a produce a deadlock.
 
 On the other hand, if you take the other approach of reducing the stack 
 usage by creating a printk_symbol interface, the stack usage would drop 
 from 350 bytes to 128 bytes and your problem would go away entirely.
 

OK. I'll wait for additional comments while thinking about a third
route. (Following your suggestion)

- Gilboa

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Redundent/missing code in dik_show_trace (arch/alpha/kernel/traps.c)

2007-09-21 Thread Gilboa Davara
Hello all,

As the title suggest.
i is declared as 0 and never assigned a new value.
Down the code there's an if (i40) and a certain dangling piece of code.
I assume that I should have gotten a string length (?) from somewhere?

Am I missing something?

- Gilboa

---
static void
dik_show_trace(unsigned long *sp)
{
long i = 0;
printk(Trace:\n);
while (0x1ff8  (unsigned long) sp) {
extern char _stext[], _etext[];
unsigned long tmp = *sp;
sp++;
if (tmp  (unsigned long) _stext)
continue;
if (tmp = (unsigned long) _etext)
continue;
printk([%lx], tmp);
print_symbol( %s, tmp);
printk(\n);
if (i  40) {
printk( ...);
break;
}
}
printk(\n);
}


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-15 Thread Gilboa Davara
Hello all, Satyam,

This is my second stab at solving the "stack over flow due to
dump_strace when close to stack-overflow is detected by do_IRQ" problem.
(Hopefully) this patch is creates less noise then the previous one.

[snip]
> I'll try and create an option 2 (static allocation, minimal locking)
> patch and post ASAP.
> Hopefully it'll fare better. (While keeping the current interface intact
> and reducing the damage/noise)
> 
> - Gilboa

- Gilboa

--- linux-2.6/kernel/kallsyms.orig  2007-09-15 11:46:54.0 +0300
+++ linux-2.6/kernel/kallsyms.c 2007-09-15 21:06:55.0 +0300
@@ -306,13 +306,14 @@ int lookup_symbol_attrs(unsigned long ad
return lookup_module_symbol_attrs(addr, size, offset, modname, name);
 }
 
-/* Look up a kernel symbol and return it in a text buffer. */
-int sprint_symbol(char *buffer, unsigned long address)
+/* Internal version:
+   Look up a kernel symbol and module name and return them to the
+ caller's buffer/namebuf buffers. */
+int __sprint_symbol(char *buffer, char *namebuf, unsigned long address)
 {
-   char *modname;
-   const char *name;
unsigned long offset, size;
-   char namebuf[KSYM_NAME_LEN];
+   const char *name;
+   char *modname;
 
name = kallsyms_lookup(address, , , , namebuf);
if (!name)
@@ -325,14 +326,35 @@ int sprint_symbol(char *buffer, unsigned
return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
 }
 
+/* Exported version:
+   Look up a kernel symbol and return it in a text buffer. */
+int sprint_symbol(char *buffer, unsigned long address)
+{
+   char namebuf[KSYM_NAME_LEN];
+
+   return __sprint_symbol(buffer, namebuf, address);
+}
+
+static DEFINE_SPINLOCK(symbol_lock);
+
 /* Look up a kernel symbol and print it to the kernel messages. */
 void __print_symbol(const char *fmt, unsigned long address)
 {
-   char buffer[KSYM_SYMBOL_LEN];
+   /* Use static buffers instead of char array to reduce
+stack footprint in i386/4KSTACKS.
+Buffers must be protected against re-entry. */
+   static char namebuf[KSYM_NAME_LEN];
+   static char buffer[KSYM_SYMBOL_LEN];
+   unsigned long flags;
+
 
-   sprint_symbol(buffer, address);
+   spin_lock_irqsave(_lock, flags);
+
+   __sprint_symbol(buffer, namebuf, address);
 
printk(fmt, buffer);
+
+   spin_unlock_irqrestore(_lock, flags);
 }
 
 /* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Minor patch] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-15 Thread Gilboa Davara
On Sat, 2007-09-15 at 18:32 +0530, Satyam Sharma wrote:
> Hi,
> 
> 
> On 9/15/07, Gilboa Davara <[EMAIL PROTECTED]> wrote:
> > Hello all,
> >
> > In a small exchange in fedora-kernel-list [1] Eric Sandeen has pointed
> > out a possible stack overflow... when CONFIG_DEBUG_STACKOVERFLOW is
> > enabled. (Though not limited to it)
> 
> Yeah, I have experienced this phenomenon/problem myself.
> 
> 
> > Code path is simple: do_IRQ detects a a near stack overflow condition
> > and calls show_trace_log_lvl which, down the line uses __print_symbol
> > and sprint_symbol to print the call stack.
> > However,  both __print_symbol + sprint_symbol are eating no-less then
> > 128+223 bytes on static char arrays, which, given the fact that this
> > code path is actually generated by low stack warning (< 512 bytes),
> > might turn a minor (?) problem (low stack) into a full blown crash.
> 
> __print_symbol() and sprint_symbol() are called multiple times during
> oopsen / panics. I think those buffers were static char arrays for a good
> reason ...

OK. Point taken.
I pull this patch pending some additional thinking.

> > The patch itself is fairly simple and non-intrusive. [2]
> > Both functions allocate memory for their buffers - falling back to
> > minimal address display if memory allocation fails.
> >
> > P.S. Can anyone please point me to the maintainer of kernel/syms? (I
> > rather not spam world + dog for such a minor patch)
> 
> Anything that touches the panic codepath is important, not minor at all.

Bad wording on my part.
By minor I meant, changes a single source file, doesn't change
interfaces, doesn't change code behavior beyond it's local scope.

> > [2]. In theory, there's a second option: pre-allocating memory on a
> > per_cpu basis, however:
> > A. dump_trace/stack are usually called when something bad has happened -
> > reducing the need for performance optimizations.
> 
> That's not a performance optimization -- avoiding repeated kmalloc()'s in the
> panic codepath sounds like a *requirement* to me.

ACK.

Though in my defense, solution [2] requires a massive surgery that would
have made this patch far more intrusive.

> 
> 
> > B. per_cpu allocation will also require local_irq_disable/enable as both
> > functions are being called from multiple contexts. Too much hassle.
> 
> I think not bothering about any locking in these codepaths may not be an
> entirely unreasonable thing to do (sorry about the triple negation in the
> sentence). What I mean is that there are places in these codepaths where
> we already don't bother with locking ...
> 
> Overall I don't much like introducing kmalloc(GFP_ATOMIC) in these codepaths
> and would ask you guys to consider some other pre-allocation (i.e. static
> allocation not on stack but in .data) alternative instead ...
> 

> Satyam

No locking what-so-ever is a bad idea. dump_stack/trace are being called
by non-fatal sources (sleep while atomic; stack-check; debugging) that
may produce problematic results if a static/shared buffer is being used
with no locks.
We can agree that using in-stack char buffer is very problematic -
especially given the fact that 4K is becoming the default build option.

I'll try and create an option 2 (static allocation, minimal locking)
patch and post ASAP.
Hopefully it'll fare better. (While keeping the current interface intact
and reducing the damage/noise)

- Gilboa

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Minor patch] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-15 Thread Gilboa Davara
Hello all,

In a small exchange in fedora-kernel-list [1] Eric Sandeen has pointed
out a possible stack overflow... when CONFIG_DEBUG_STACKOVERFLOW is
enabled. (Though not limited to it)

Code path is simple: do_IRQ detects a a near stack overflow condition
and calls show_trace_log_lvl which, down the line uses __print_symbol
and sprint_symbol to print the call stack.
However,  both __print_symbol + sprint_symbol are eating no-less then
128+223 bytes on static char arrays, which, given the fact that this
code path is actually generated by low stack warning (< 512 bytes),
might turn a minor (?) problem (low stack) into a full blown crash.

The patch itself is fairly simple and non-intrusive. [2]
Both functions allocate memory for their buffers - falling back to
minimal address display if memory allocation fails.

P.S. Can anyone please point me to the maintainer of kernel/syms? (I
rather not spam world + dog for such a minor patch)

-- 
Gilboa Davara <[EMAIL PROTECTED]>

[1]
http://www.mail-archive.com/[EMAIL PROTECTED]/msg00640.html

[2]. In theory, there's a second option: pre-allocating memory on a
per_cpu basis, however:
A. dump_trace/stack are usually called when something bad has happened -
reducing the need for performance optimizations.
B. per_cpu allocation will also require local_irq_disable/enable as both
functions are being called from multiple contexts. Too much hassle.


--- linux-2.6/kernel/kallsyms.orig  2007-09-15 11:46:54.0 +0300
+++ linux-2.6/kernel/kallsyms.c 2007-09-15 14:25:21.0 +0300
@@ -309,30 +309,62 @@ int lookup_symbol_attrs(unsigned long ad
 /* Look up a kernel symbol and return it in a text buffer. */
 int sprint_symbol(char *buffer, unsigned long address)
 {
-   char *modname;
-   const char *name;
unsigned long offset, size;
-   char namebuf[KSYM_NAME_LEN];
+   const char *name = NULL;
+   char *namebuf = NULL;
+   char *modname;
+   int ret;
+
+
+   /* Static buffer allocation.
+  Required in-order to reduce stack footprint on
+do_IRQ/4KSTACK/i386 */
+   namebuf = kmalloc(KSYM_NAME_LEN, GFP_ATOMIC);
+   if (namebuf)
+   name = kallsyms_lookup(address, , ,
+   , namebuf);
 
-   name = kallsyms_lookup(address, , , , namebuf);
if (!name)
-   return sprintf(buffer, "0x%lx", address);
+   ret = sprintf(buffer, "0x%lx", address);
+   else {
+   if (modname)
+   ret = sprintf(buffer, "%s+%#lx/%#lx [%s]",
+   name, offset, size, modname);
+   else
+   ret = sprintf(buffer, "%s+%#lx/%#lx",
+   name, offset, size);
+   }
 
-   if (modname)
-   return sprintf(buffer, "%s+%#lx/%#lx [%s]", name, offset,
-   size, modname);
-   else
-   return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
+   if (namebuf)
+   kfree(namebuf);
+
+   return ret;
 }
 
 /* Look up a kernel symbol and print it to the kernel messages. */
 void __print_symbol(const char *fmt, unsigned long address)
 {
-   char buffer[KSYM_SYMBOL_LEN];
+   char *buffer = NULL;
 
-   sprint_symbol(buffer, address);
 
-   printk(fmt, buffer);
+   /* Static buffer allocation.
+  Required in-order to reduce stack footprint on
+do_IRQ/4KSTACK/i386 */
+   buffer = kmalloc(KSYM_SYMBOL_LEN, GFP_ATOMIC);
+   if (buffer) {
+   sprint_symbol(buffer, address);
+   printk(fmt, buffer);
+   kfree(buffer);
+   } else {
+   /* Address + '0x' + NULL. */
+   char sbuffer[(BITS_PER_LONG / 4) + 3];
+
+   /* Fall-back mode.
+  Memory allocation failed.
+  Convert the address to string and display it. */
+   sprintf(sbuffer, "0x%lx", address);
+   printk(fmt, sbuffer);
+   }
 }
 
 /* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Minor patch] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-15 Thread Gilboa Davara
Hello all,

In a small exchange in fedora-kernel-list [1] Eric Sandeen has pointed
out a possible stack overflow... when CONFIG_DEBUG_STACKOVERFLOW is
enabled. (Though not limited to it)

Code path is simple: do_IRQ detects a a near stack overflow condition
and calls show_trace_log_lvl which, down the line uses __print_symbol
and sprint_symbol to print the call stack.
However,  both __print_symbol + sprint_symbol are eating no-less then
128+223 bytes on static char arrays, which, given the fact that this
code path is actually generated by low stack warning ( 512 bytes),
might turn a minor (?) problem (low stack) into a full blown crash.

The patch itself is fairly simple and non-intrusive. [2]
Both functions allocate memory for their buffers - falling back to
minimal address display if memory allocation fails.

P.S. Can anyone please point me to the maintainer of kernel/syms? (I
rather not spam world + dog for such a minor patch)

-- 
Gilboa Davara [EMAIL PROTECTED]

[1]
http://www.mail-archive.com/[EMAIL PROTECTED]/msg00640.html

[2]. In theory, there's a second option: pre-allocating memory on a
per_cpu basis, however:
A. dump_trace/stack are usually called when something bad has happened -
reducing the need for performance optimizations.
B. per_cpu allocation will also require local_irq_disable/enable as both
functions are being called from multiple contexts. Too much hassle.


--- linux-2.6/kernel/kallsyms.orig  2007-09-15 11:46:54.0 +0300
+++ linux-2.6/kernel/kallsyms.c 2007-09-15 14:25:21.0 +0300
@@ -309,30 +309,62 @@ int lookup_symbol_attrs(unsigned long ad
 /* Look up a kernel symbol and return it in a text buffer. */
 int sprint_symbol(char *buffer, unsigned long address)
 {
-   char *modname;
-   const char *name;
unsigned long offset, size;
-   char namebuf[KSYM_NAME_LEN];
+   const char *name = NULL;
+   char *namebuf = NULL;
+   char *modname;
+   int ret;
+
+
+   /* Static buffer allocation.
+  Required in-order to reduce stack footprint on
+do_IRQ/4KSTACK/i386 */
+   namebuf = kmalloc(KSYM_NAME_LEN, GFP_ATOMIC);
+   if (namebuf)
+   name = kallsyms_lookup(address, size, offset,
+   modname, namebuf);
 
-   name = kallsyms_lookup(address, size, offset, modname, namebuf);
if (!name)
-   return sprintf(buffer, 0x%lx, address);
+   ret = sprintf(buffer, 0x%lx, address);
+   else {
+   if (modname)
+   ret = sprintf(buffer, %s+%#lx/%#lx [%s],
+   name, offset, size, modname);
+   else
+   ret = sprintf(buffer, %s+%#lx/%#lx,
+   name, offset, size);
+   }
 
-   if (modname)
-   return sprintf(buffer, %s+%#lx/%#lx [%s], name, offset,
-   size, modname);
-   else
-   return sprintf(buffer, %s+%#lx/%#lx, name, offset, size);
+   if (namebuf)
+   kfree(namebuf);
+
+   return ret;
 }
 
 /* Look up a kernel symbol and print it to the kernel messages. */
 void __print_symbol(const char *fmt, unsigned long address)
 {
-   char buffer[KSYM_SYMBOL_LEN];
+   char *buffer = NULL;
 
-   sprint_symbol(buffer, address);
 
-   printk(fmt, buffer);
+   /* Static buffer allocation.
+  Required in-order to reduce stack footprint on
+do_IRQ/4KSTACK/i386 */
+   buffer = kmalloc(KSYM_SYMBOL_LEN, GFP_ATOMIC);
+   if (buffer) {
+   sprint_symbol(buffer, address);
+   printk(fmt, buffer);
+   kfree(buffer);
+   } else {
+   /* Address + '0x' + NULL. */
+   char sbuffer[(BITS_PER_LONG / 4) + 3];
+
+   /* Fall-back mode.
+  Memory allocation failed.
+  Convert the address to string and display it. */
+   sprintf(sbuffer, 0x%lx, address);
+   printk(fmt, sbuffer);
+   }
 }
 
 /* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Minor patch] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-15 Thread Gilboa Davara
On Sat, 2007-09-15 at 18:32 +0530, Satyam Sharma wrote:
 Hi,
 
 
 On 9/15/07, Gilboa Davara [EMAIL PROTECTED] wrote:
  Hello all,
 
  In a small exchange in fedora-kernel-list [1] Eric Sandeen has pointed
  out a possible stack overflow... when CONFIG_DEBUG_STACKOVERFLOW is
  enabled. (Though not limited to it)
 
 Yeah, I have experienced this phenomenon/problem myself.
 
 
  Code path is simple: do_IRQ detects a a near stack overflow condition
  and calls show_trace_log_lvl which, down the line uses __print_symbol
  and sprint_symbol to print the call stack.
  However,  both __print_symbol + sprint_symbol are eating no-less then
  128+223 bytes on static char arrays, which, given the fact that this
  code path is actually generated by low stack warning ( 512 bytes),
  might turn a minor (?) problem (low stack) into a full blown crash.
 
 __print_symbol() and sprint_symbol() are called multiple times during
 oopsen / panics. I think those buffers were static char arrays for a good
 reason ...

OK. Point taken.
I pull this patch pending some additional thinking.

  The patch itself is fairly simple and non-intrusive. [2]
  Both functions allocate memory for their buffers - falling back to
  minimal address display if memory allocation fails.
 
  P.S. Can anyone please point me to the maintainer of kernel/syms? (I
  rather not spam world + dog for such a minor patch)
 
 Anything that touches the panic codepath is important, not minor at all.

Bad wording on my part.
By minor I meant, changes a single source file, doesn't change
interfaces, doesn't change code behavior beyond it's local scope.

  [2]. In theory, there's a second option: pre-allocating memory on a
  per_cpu basis, however:
  A. dump_trace/stack are usually called when something bad has happened -
  reducing the need for performance optimizations.
 
 That's not a performance optimization -- avoiding repeated kmalloc()'s in the
 panic codepath sounds like a *requirement* to me.

ACK.

Though in my defense, solution [2] requires a massive surgery that would
have made this patch far more intrusive.

 
 
  B. per_cpu allocation will also require local_irq_disable/enable as both
  functions are being called from multiple contexts. Too much hassle.
 
 I think not bothering about any locking in these codepaths may not be an
 entirely unreasonable thing to do (sorry about the triple negation in the
 sentence). What I mean is that there are places in these codepaths where
 we already don't bother with locking ...
 
 Overall I don't much like introducing kmalloc(GFP_ATOMIC) in these codepaths
 and would ask you guys to consider some other pre-allocation (i.e. static
 allocation not on stack but in .data) alternative instead ...
 

 Satyam

No locking what-so-ever is a bad idea. dump_stack/trace are being called
by non-fatal sources (sleep while atomic; stack-check; debugging) that
may produce problematic results if a static/shared buffer is being used
with no locks.
We can agree that using in-stack char buffer is very problematic -
especially given the fact that 4K is becoming the default build option.

I'll try and create an option 2 (static allocation, minimal locking)
patch and post ASAP.
Hopefully it'll fare better. (While keeping the current interface intact
and reducing the damage/noise)

- Gilboa

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reduce __print_symbol/sprint_symbol stack usage.

2007-09-15 Thread Gilboa Davara
Hello all, Satyam,

This is my second stab at solving the stack over flow due to
dump_strace when close to stack-overflow is detected by do_IRQ problem.
(Hopefully) this patch is creates less noise then the previous one.

[snip]
 I'll try and create an option 2 (static allocation, minimal locking)
 patch and post ASAP.
 Hopefully it'll fare better. (While keeping the current interface intact
 and reducing the damage/noise)
 
 - Gilboa

- Gilboa

--- linux-2.6/kernel/kallsyms.orig  2007-09-15 11:46:54.0 +0300
+++ linux-2.6/kernel/kallsyms.c 2007-09-15 21:06:55.0 +0300
@@ -306,13 +306,14 @@ int lookup_symbol_attrs(unsigned long ad
return lookup_module_symbol_attrs(addr, size, offset, modname, name);
 }
 
-/* Look up a kernel symbol and return it in a text buffer. */
-int sprint_symbol(char *buffer, unsigned long address)
+/* Internal version:
+   Look up a kernel symbol and module name and return them to the
+ caller's buffer/namebuf buffers. */
+int __sprint_symbol(char *buffer, char *namebuf, unsigned long address)
 {
-   char *modname;
-   const char *name;
unsigned long offset, size;
-   char namebuf[KSYM_NAME_LEN];
+   const char *name;
+   char *modname;
 
name = kallsyms_lookup(address, size, offset, modname, namebuf);
if (!name)
@@ -325,14 +326,35 @@ int sprint_symbol(char *buffer, unsigned
return sprintf(buffer, %s+%#lx/%#lx, name, offset, size);
 }
 
+/* Exported version:
+   Look up a kernel symbol and return it in a text buffer. */
+int sprint_symbol(char *buffer, unsigned long address)
+{
+   char namebuf[KSYM_NAME_LEN];
+
+   return __sprint_symbol(buffer, namebuf, address);
+}
+
+static DEFINE_SPINLOCK(symbol_lock);
+
 /* Look up a kernel symbol and print it to the kernel messages. */
 void __print_symbol(const char *fmt, unsigned long address)
 {
-   char buffer[KSYM_SYMBOL_LEN];
+   /* Use static buffers instead of char array to reduce
+stack footprint in i386/4KSTACKS.
+Buffers must be protected against re-entry. */
+   static char namebuf[KSYM_NAME_LEN];
+   static char buffer[KSYM_SYMBOL_LEN];
+   unsigned long flags;
+
 
-   sprint_symbol(buffer, address);
+   spin_lock_irqsave(symbol_lock, flags);
+
+   __sprint_symbol(buffer, namebuf, address);
 
printk(fmt, buffer);
+
+   spin_unlock_irqrestore(symbol_lock, flags);
 }
 
 /* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/