RE: [PATCH] Re: reliability of linux-vm subsystem

2000-11-14 Thread Chris Swiedler

> > Good, so the OOM killer works.
>
> But it doesn't work for this kind of application misbehaviours (or
> user attacks):
>
> main() { while(1) if (fork()) malloc(1); }

This seems to be a fork() bomb, not a VM issue. The system is overwhelmed by
the forks, not by the space consumed by the allocations themselves. For
one thing, I've found that

main() { while(1) malloc(1024*1024); }

does not kill your system very quickly (if at all). Without actually writing
to the memory, it doesn't seem to be "really" allocated. Adding a memset()
will kill your system much more quickly.

chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



[PATCH] bugfix in oom_kill.c

2000-11-14 Thread Chris Swiedler

This patch fixes a bug in oom_kill. The way it was written, the OOM killer
would try to kill the idle task if the task selected immediately before it
had the most "badness". Probably because of the order of for_each_task(),
this wouldn't ever happen, but I don't think we want to depend on that.

chris

--- official/linux-2.4.0/mm/oom_kill.c  Mon Nov  6 23:53:01 2000
+++ work/linux-2.4.0-test10/mm/oom_kill.c   Thu Nov  9 23:12:10 2000
@@ -124,11 +143,12 @@
 	read_lock(&tasklist_lock);
 	for_each_task(p)
 	{
-		if (p->pid)
+		if (p->pid) {
 			points = badness(p);
-		if (points > maxpoints) {
-			chosen = p;
-			maxpoints = points;
+			if (points > maxpoints) {
+				chosen = p;
+				maxpoints = points;
+			}
 		}
 	}
 	read_unlock(&tasklist_lock);




News gateway not working

2000-09-19 Thread Chris Swiedler

I didn't want to post to the list with this, but [EMAIL PROTECTED]
didn't get a reply.

The NNTP gateway hasn't been working for two weeks-- the last list message
was 9/2/2000. Maybe this is common knowledge on the list (since I'm not
subscribed, I obviously wouldn't know...) but it's a little discouraging to
get absolute silence on fa.linux.kernel. If the problem is known, then
someone should post directly to the newsgroup and let us know when it might
be fixed. I know there were some massive changes when the list switched to
new servers, and hopefully that's the reason and it can be fixed soon.

For many people, the newsgroup is a much easier way to read the list, and
furthermore it reduces the load on vger, because otherwise we'd have to
subscribe. So please...look kindly on us.

chris




RE: Linux's implementation of poll() not scalable?

2000-10-27 Thread Chris Swiedler

> It doesn't practically matter how efficient the X server is when
> you aren't busy, after all.

A simple polling scheme (i.e. not using poll() or select(), just looping
through all fd's trying nonblocking reads) is perfectly efficient when the
server is 100% busy, and perfectly inefficient when there is nothing to do.
I'm not saying that your statements are wrong--in your example, X is calling
select() which is not wasting as much time as a hard-polling loop--but it's
wrong to say that high-load efficiency is the primary concern. I would be
horrified if X took a significant portion of the CPU time when many clients
were connected, but none were actually doing anything.

chris




include fb.h from userland?

2000-11-03 Thread Chris Swiedler

I understand that the headers in /usr/include/linux shouldn't be overwritten
by new kernel installs. But can someone elaborate on Linus's original
admonition
(http://kernelnotes.org/lnxlists/linux-kernel/lk_0007_04/msg00881.html)? Am
I never, ever, ever allowed to update my system headers for the rest of my
life, or is it only if I follow some particular procedure, such as
recompiling glibc?

The reason I want to upgrade my system headers is that framebuffer
development requires linux/fb.h to be included from userland (I see no way
around that). The version of fb.h in my system headers is from 2.2.5, the
distro version I originally installed. I'm running a 2.2.17 kernel now, which
has a much newer fb.h that I need.

chris




[PATCH] protect processes from OOM killer

2000-11-07 Thread Chris Swiedler

Here's a small patch to allow a user to protect certain PIDs from
death-by-OOM-killer. It uses the proc entry '/proc/sys/vm/oom_protect'; echo
the PIDs to be protected:

echo 1 516 > /proc/sys/vm/oom_protect

The idea is that sysadmins can mark some daemon processes as off-limits for
the OOM killer. Stuff like syslogd, init, etc. Incidentally, this answers
Andrea's concern about the init process getting killed. In fact, it might
be a good idea to default the list of protected PIDs to be { 1 }.


Things I'd like to add:

- ability to append PIDs. Using the 'echo >>' syntax would be nice, but
/proc files don't seem to support appending. (Is this true?)

- symbolic process names as well as PIDs, maybe process groups too?

- perhaps a more complex interface, where instead of just marking a PID as
absolutely protected, you could specify a 'weight' which factored into the
OOM algorithm. Something like "nice":

-20 : unkillable
-19 to -1: try not to kill
1 to 19: try to kill these first

echo netscape:10 > /proc/sys/vm/oom_protect

...would suggest that "netscape" is a process which is a good candidate
for OOM killing.

I don't think that we should make the OOM heuristic any more complex.
However, letting the user make suggestions about what should and should not
be killed is a Good Thing.

This is my very first patch, so please be considerate.

Against 2.4.0-test10. Comments and suggestions appreciated!

chris

--- official/linux-2.4.0-test10/mm/oom_kill.c   Mon Nov  6 23:40:52 2000
+++ work/linux-2.4.0-test10/mm/oom_kill.c   Mon Nov  6 23:37:47 2000
@@ -20,9 +20,32 @@
 #include 
 #include 
 #include 
+#include 
+
+#define MAX_OOM_PROTECTS 256
+
+int sysctl_oom_protects[MAX_OOM_PROTECTS];

 /* #define DEBUG */

+int is_oom_protected(int pid)
+{
+	int i;
+	for (i = 0; i < MAX_OOM_PROTECTS; i++) {
+		int ppid = sysctl_oom_protects[i];
+
+		#ifdef DEBUG
+		printk("Protected pid: %d\n",ppid);
+		#endif
+
+		if (ppid == pid)
+			return 1;
+		if (ppid == 0)
+			return 0;
+	}
+	return 0;
+}
+
 /**
  * int_sqrt - oom_kill.c internal function, rough approximation to sqrt
  * @x: integer of which to calculate the sqrt
@@ -124,6 +147,19 @@
 	read_lock(&tasklist_lock);
 	for_each_task(p)
 	{
+		#ifdef DEBUG
+		printk("Testing pid %d\n",p->pid);
+		#endif
+
+		if (is_oom_protected(p->pid))
+		{
+			#ifdef DEBUG
+			printk("Pid %d is protected\n",p->pid);
+			#endif
+
+			continue;
+		}
+
 		if (p->pid)
 			points = badness(p);
 		if (points > maxpoints) {
--- official/linux-2.4.0-test10/kernel/sysctl.c Mon Nov  6 23:40:52 2000
+++ work/linux-2.4.0-test10/kernel/sysctl.c Mon Nov  6 23:30:08 2000
@@ -85,6 +85,8 @@

 extern int pgt_cache_water[];

+extern int sysctl_oom_protects [];
+
 static int parse_table(int *, int, void *, size_t *, void *, size_t,
   ctl_table *, void **);
 static int proc_doutsstring(ctl_table *table, int write, struct file *filp,
@@ -241,6 +243,10 @@
 &bdflush_min, &bdflush_max},
{VM_OVERCOMMIT_MEMORY, "overcommit_memory", &sysctl_overcommit_memory,
 sizeof(sysctl_overcommit_memory), 0644, NULL, &proc_dointvec},
+
+   {VM_OVERCOMMIT_MEMORY, "oom_protect", &sysctl_oom_protects,
+256, 0644, NULL, &proc_dointvec},
+
{VM_BUFFERMEM, "buffermem",
 &buffer_mem, sizeof(buffer_mem_t), 0644, NULL, &proc_dointvec},
{VM_PAGECACHE, "pagecache",




getting a process name from task struct

2000-11-09 Thread Chris Swiedler

Is it possible to get a process's name / full execution path (from
kernelspace) given only a task struct? I can't find any pointers to this
information in the task struct, and I don't know where else it might be. ps
seems to be able to get the process name, but that's from userspace.
Apologies in advance if this is a stupid question.

chris




[PATCH] oom_nice

2000-11-10 Thread Chris Swiedler

Here's an updated version of the "oom_nice" patch. It allows a sysadmin to
set the "oom niceness" for processes, either by PID or by process name. The
oom niceness value factors into the badness() function called by Rik's
OOM killer. Negative values decrease the chance that the process will be
killed, and positive values increase it.

The usage is:

echo [PID|process_name]=oom_niceness > /proc/sys/vm/oom_nice

examples:

echo 418=-10 > /proc/sys/vm/oom_nice
echo netscape=20 > /proc/sys/vm/oom_nice
echo 1=- > /proc/sys/vm/oom_nice

In the first example, the process with PID 418 is 10 times less likely to
be killed than it would have been. Likewise, in the second example, any
processes named 'netscape' are 20 times more likely to be killed than
otherwise. The last example protects the init process from being killed,
no matter what.

cating oom_nice will show the current nice values for all processes.

By default the oom_nice proc entry is not world-readable or writable. For
security reasons I would suggest that you give good (negative) oom nice
values to processes by PID rather than process name. If any process named
'init' is protected, then it's easy for a user to just rename their
executable and get around the OOM killer.

To test the OOM killer algorithm I also included a proc entry
/proc/sys/vm/oom_nice_test. On my machine 'cat /proc/sys/vm/oom_nice_test'
produces:

"OOM killer would have killed process 516 (csh) with 496 points"

Compiling oom_kill.c with DEBUG defined and cating oom_nice_test will print
out the points for all processes, including their oom_nice values and how
they affected the final points.

diff -u -N official/linux-2.4.0/mm/Makefile work/linux-2.4.0-test10/mm/Makefile
--- official/linux-2.4.0/mm/Makefile  Mon Nov  6 23:53:01 2000
+++ work/linux-2.4.0-test10/mm/Makefile Tue Nov  7 22:01:00 2000
@@ -10,7 +10,8 @@
 O_TARGET := mm.o
 O_OBJS  := memory.o mmap.o filemap.o mprotect.o mlock.o mremap.o \
vmalloc.o slab.o bootmem.o swap.o vmscan.o page_io.o \
-   page_alloc.o swap_state.o swapfile.o numa.o oom_kill.o
+   page_alloc.o swap_state.o swapfile.o numa.o oom_kill.o \
+   oom_nice.o

 ifeq ($(CONFIG_HIGHMEM),y)
 O_OBJS += highmem.o
--- official/linux-2.4.0/mm/oom_kill.c  Mon Nov  6 23:53:01 2000
+++ work/linux-2.4.0-test10/mm/oom_kill.c   Thu Nov  9 23:12:10 2000
@@ -20,9 +20,12 @@
 #include 
 #include 
 #include 
+#include 

 /* #define DEBUG */

+extern int get_oom_nice(struct task_struct *ts);
+
 /**
  * int_sqrt - oom_kill.c internal function, rough approximation to sqrt
  * @x: integer of which to calculate the sqrt
@@ -55,9 +58,9 @@
  *of least surprise ... (be careful when you change it)
  */

-static int badness(struct task_struct *p)
+int badness(struct task_struct *p)
 {
-   int points, cpu_time, run_time;
+   int points, cpu_time, run_time, oom_nice;

if (!p->mm)
return 0;
@@ -101,6 +104,22 @@
 	 */
 	if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO))
 		points /= 4;
+
+	oom_nice = get_oom_nice(p);
+#ifdef DEBUG
+	if (oom_nice != 0)
+		printk(KERN_DEBUG "OOMkill: task %d (%s) has oom_nice=%d. start points: %d\n",
+			p->pid,p->comm,oom_nice,points);
+#endif
+
+	if (oom_nice == INT_MIN)
+		points = 0;
+	else if (oom_nice > 0)
+		points *= oom_nice;
+	else if (oom_nice < 0)
+		points /= -oom_nice;
+
+
 #ifdef DEBUG
 	printk(KERN_DEBUG "OOMkill: task %d (%s) got %d points\n",
 		p->pid, p->comm, points);
@@ -124,11 +143,12 @@
 	read_lock(&tasklist_lock);
 	for_each_task(p)
 	{
-		if (p->pid)
+		if (p->pid) {
 			points = badness(p);
-		if (points > maxpoints) {
-			chosen = p;
-			maxpoints = points;
+			if (points > maxpoints) {
+				chosen = p;
+				maxpoints = points;
+			}
 		}
 	}
 	read_unlock(&tasklist_lock);
@@ -156,7 +176,7 @@
 	if (p == NULL)
 		panic("Out of memory and no killable processes...\n");

-	printk(KERN_ERR "Out of Memory: Killed process %d (%s).", p->pid, p->comm);
+	printk(KERN_ERR "Out of Memory: Killed process %d (%s).\n", p->pid, p->comm);

 	/*
 	 * We give our sacrificial lamb high priority and access to
diff -u -N official/linux-2.4.0/mm/oom_nice.c work/linux-2.4.0-test10/mm/oom_nice.c
--- official/linux-2.4.0/mm/oom_nice.c  Wed Dec 31 19:00:00 1969
+++ work/linux-2.4.0-test10/mm/oom_nice.c   Thu Nov  9 23:19:45 2000
@@ -0,0 +1,250 @@
+/*
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifndef CONFIG_PROC_FS
+#error You really need /proc support for oom_

RE: asm-i386/uaccess.h changes: bug or feature?

2000-10-04 Thread Chris Swiedler

To clarify: you're getting missing-symbol errors (not duplicate-symbols)?

I believe that the "return" versions of these macros have been deprecated.
There's an effort going on to replace these functions with a standard
"put_user(); return;" pair. People think that having a macro which returns
from a function is a bad idea. I imagine the source you're using hasn't been
updated; I would suggest removing the xxx_ret macros from the package you're
compiling (or contacting its maintainer).

chris

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Wes McRae
> Sent: Wednesday, October 04, 2000 3:00 PM
> To: [EMAIL PROTECTED]
> Subject: asm-i386/uaccess.h changes: bug or feature?
>
>
> Hello
>
> Background: compiling lm_sensors 2.5.2 on RedHat 7.0 running 2.4.0-test8
> kernel.  (vanilla intel system)
>
> Attempting to compile the sensor package gave symbol errors regarding
> the xxx_ret symbols (copying to/from user space, putting/getting).  These
> have been removed from uaccess.h in recent kernels.  These were present
> in 2.2.16 and at least some 2.3.x kernels.  As  you can see from the
> appended diff, these appear to be the only changes.  I tend to assume
> there's a reason, but could find no explanation in the kernel docs for
> it.
>
> For what it's worth, I could still compile the application--just not
> load its modules.
>
> Many apologies if this is not the right place to send this--it seemed
> the most likely place after checking MAINTAINERS and REPORTING-BUGS.
>
> bye
> wes
>
> --- /usr/include/asm/uaccess.h  Fri Aug 25 08:31:57 2000
> +++ /usr/src/linux/include/asm/uaccess.h  Fri Sep 29 12:09:04 2000
>
> @@ -232,20 +232,6 @@
> : "=r"(err), ltype (x)  \
> : "m"(__m(addr)), "i"(-EFAULT), "0"(err))
>
> -/*
> - * The "xxx_ret" versions return constant specified in third argument, if
> - * something bad happens. These macros can be optimized for the
> - * case of just returning from the function xxx_ret is used.
> - */
> -
> -#define put_user_ret(x,ptr,ret) ({ if (put_user(x,ptr)) return ret; })
> -
> -#define get_user_ret(x,ptr,ret) ({ if (get_user(x,ptr)) return ret; })
> -
> -#define __put_user_ret(x,ptr,ret) ({ if (__put_user(x,ptr)) return ret; })
> -
> -#define __get_user_ret(x,ptr,ret) ({ if (__get_user(x,ptr)) return ret; })
> -
>
>  /*
>   * Copy To/From Userspace
> @@ -582,10 +568,6 @@
> (__builtin_constant_p(n) ?  \
>  __constant_copy_from_user((to),(from),(n)) :   \
>  __generic_copy_from_user((to),(from),(n)))
> -
> -#define copy_to_user_ret(to,from,n,retval) ({ if (copy_to_user(to,from,n)) return retval; })
> -
> -#define copy_from_user_ret(to,from,n,retval) ({ if (copy_from_user(to,from,n)) return retval; })
>
>  #define __copy_to_user(to,from,n)  \
> (__builtin_constant_p(n) ?  \
>
>




RE: __bad_udelay in 2.2.18pre15

2000-10-11 Thread Chris Swiedler

> > 2.2.18pre15 defines udelay as (in file include/asm-i386/delay.h) :
> > extern void __bad_udelay(void);
> >
> > #define udelay(n) (__builtin_constant_p(n) ? \
> > ((n) > 2 ? __bad_udelay() : __const_udelay((n) * 0x10c6ul)) : \
> > __udelay(n))
> >
> > ...
> > It seems __bad_udelay is not defined anywhere in the kernel source.
>
> Correct. It's a compile-time error trap

Wouldn't it be better to use an #error directive? I'm sure this could turn
into a FAQ, even though the symbol is called "__bad_udelay()".

chris




RE: Updated 2.4 TODO List

2000-10-12 Thread Chris Swiedler

> On Wed, 11 Oct 2000 18:10:40 -0400,
> [EMAIL PROTECTED] wrote:
> >Are you sure it was compiled with the correct CPU?  If you configure the
> >CPU incorrectly (686 when you only have a 586, etc.) the kernel *will*
> >refuse to boot.
> >
> >Maybe we should have the kernel print the CPU information it was
> >compiled with before it does anything else.  It'll make it easier to
> >catch what may be a fairly common set of PEBCAK case
>
> Unfortunately any code like this
>   if (a)
>   b = 99;
> generates conditional move (cmove) instructions on 686.  In vsprintf.c
> there are several of these constructs, in particular strnlen generates
> it.  So printk("%s", text) tends to fault as well.  Some people have
> argued that critical routines should always be compiled with -i386,
> unfortunately that includes all of printk and all console handling
> (both serial and screen), not really an option.
>
> If anything is going to detect the mismatch and complain, it has to be
> the boot loader, after uncompressing and before entering the kernel
> proper.

But the kernel should be able to write directly to the screen, even if it's
extremely minimal information. Something like how LILO does it: test the
common hang-on-boot conditions (like wrong CPU type) and print a single
character after each test.

chris




RE: large memory support for x86

2000-10-12 Thread Chris Swiedler

> > Am I reading this correctly--the address of the main() function for a
> > process is guaranteed to be the lowest possible virtual address?
> >
> > chris
> >
>
> It is one of the lowest. The 'C' runtime library puts section
> .text (the code) first, then .data, then .bss, then .stack.  The
> .stack section is co-located with the heap which can be extended
> by setting a new break address.
>
> When a process is created, the lowest address is the entry point of
> crt0.o  _init. We can see where that is by:
>
> Script started on Thu Oct 12 14:25:35 2000
> # cat xxx.c
>
> extern int _init();
> main()
> {
> printf("_init is at %p\n", _init);
> }
>
> # gcc -o xxx xxx.c
> # ./xxx
> _init is at 0x804838c
> # exit
> exit
> Script done on Thu Oct 12 14:25:51 2000
>
> That said, remember that in Unix, the 'C' runtime library exists in the
> lower portion of the .text section. So your code's virtual address space
> starts above that address space. This is MMAPed so everybody gets
> to share the same pages. In this way, you don't all have to keep a
> private copy of the 'C' runtime library.

User-process virtual addresses have no direct relation to physical
addresses, right? So why does the process space start at such a high virtual
address (why not closer to 0x)? Seems we're wasting ~128 megs of
RAM. Not a huge amount compared to 4G, but significant. Is that space used
(libc can't be that big!) or reserved somehow?

Another question: how (and where in the code) do we translate virtual
user-addresses to physical addresses? Does the MMU do it, or does it call a
kernel handler function? Why is the kernel allowed to reference physical
addresses, while user processes go through the translation step? Can kernel
pages be swapped out / faulted in just like user process pages?

Sorry to pounce on you with all of these questions. I've read up on this
stuff but can't always find answers...

thanks--
chris




why is modprobe (and nothing else) exec()'d?

2000-10-13 Thread Chris Swiedler

Why is modprobe kept as a separate executable, when nothing else in the
kernel is (seems to be)? What is the advantage to keeping modprobe separate,
instead of statically linked into the kernel? Are users able to replace
modprobe with a better version? If so, why not do the same thing with other
occasionally-used code which could be replaced? Something like Rik's OOM
killer comes to mind, except that obviously if you're out of memory you're
not going to be able to load a new executable.

chris




RE: why is modprobe (and nothing else) exec()'d?

2000-10-13 Thread Chris Swiedler

Ok, I should have thought of that ;-). I've never used modprobe directly
myself, and had forgotten that was possible. Thanks to everyone who replied.

chris




RE: large memory support for x86

2000-10-13 Thread Chris Swiedler

> no, x86 virtual memory is 32 bits - segmentation only provides a way to
> segment this 4GB virtual memory, but cannot extend it. Under Linux there
> is 3GB virtual memory available to user-space processes.
>
> this 3GB virtual memory does not have to be mapped to the same physical
> pages all the time - and this is nothing new. mmap()/munmap()-ing memory
> dynamically is one way to 'extend' the amount of physical RAM controlled
> by a single process. I doubt this would be very economical though.
>
> Such big-RAM systems are almost always SMP systems, so eg. a 4-way system
> can have 4x 3GB processes == 12 GB RAM fully utilized. An 8-way system can
> utilize up to 24 GB RAM at once, without having to play mmap/munmap
> 'memory extender' tricks.

Why is it that a user process can't intentionally switch segments?
Dereferencing a 32-bit address causes the address to be calculated using the
"current" segment descriptor, right? It seems to me that a process could set
a new segment selector, in which case a dereference would operate on a whole
new segment. Is there a reason why processes are limited to a single
segment?

chris




quick questions: kernel stack size and call gates

2000-10-17 Thread Chris Swiedler

1.  Does Linux use call gates (as specified in the Intel SDK vol.3) when a
user process makes a system call? From what I understand, call-gates let a
ring-3 process execute ring-0 code, which sounds exactly like a system call.
I've found all of the actual system call functions (sys_ni etc.) in sys.c,
but where is the code which userland calls to transfer to "kernel mode" and
execute a particular syscall?

2.  I've often heard that the kernel stack size is set small (4 or 8k?). Is
this done by limiting the size of the stack segment itself for the kernel?
Where is the code which sets up the limit?

tia
chris
