Re: rw_semaphores

2001-04-16 Thread yodaiken

On Mon, Apr 16, 2001 at 10:05:57AM -0700, Linus Torvalds wrote:
> 
> 
> On Mon, 16 Apr 2001 [EMAIL PROTECTED] wrote:
> >
> > I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a
> > major failure and I can't. To me: the result of an attempt by the 32,768th locker
> > should be a kernel panic. Is there a reasonable scenario where this is wrong?
> 
> Hint: "I'm trying to imagine a case when writing all zeroes to /etc/passwd
> is anything but a major failure, but I can't. So why don't we make
> /etc/passwd world-writable?"
> 
> Right. Security.

The analogy is too subtle for me, but my question was not whether the
correct error response should be to panic, but whether there was a good
reason for allowing such a huge number of users of a lock.

> There is _never_ any excuse for panic'ing because of some inherent
> limitation of the data structures. You can return -ENOMEM, -EAGAIN or
> something like that, but you must _not_ allow a panic (or a roll-over,
> which would just result in corrupted kernel data structures).

There's a difference between a completely reasonable situation in which
all of some resource has been committed and a situation which in itself
indicates some sort of fundamental error. If 32K+ users of a lock is an
error, then returning -ENOMEM may be inadequate.

> 
> Note that the limit is probably really easy to work around even without
> extending the number of bits: a sleeper that notices that the count is
> even _halfway_ to rolling around could easily do something like:
> 
>  - undo "this process" action
>  - sleep for 1 second
>  - try again from the beginning.
> 
> I certainly agree that no _reasonable_ pattern can cause the failure, but
> we need to worry about people who are malicious. The above trivial
> approach would take care of that, while not penalizing any non-malicious
> users.
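
The back-off described in the quoted steps can be sketched in user-space C with C11 atomics. This is only an illustration under stated assumptions: the names `lock_count`, `COUNT_LIMIT`, `BACKOFF_MARK`, `try_take`, `take` and `release` are hypothetical, the counter stands in for the semaphore's 15-bit count field, and none of this is taken from the kernel's rw_semaphore code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

#define COUNT_LIMIT  32767              /* largest value the count field can hold */
#define BACKOFF_MARK (COUNT_LIMIT / 2)  /* "halfway to rolling around" */

static atomic_int lock_count;           /* hypothetical stand-in for the count */

/* One attempt: returns true if the caller now holds a reference. */
static bool try_take(void)
{
    int now = atomic_fetch_add(&lock_count, 1) + 1;

    if (now > BACKOFF_MARK) {
        atomic_fetch_sub(&lock_count, 1);   /* undo "this process" action */
        return false;
    }
    return true;
}

/* Keep trying, sleeping one second between attempts. */
static void take(void)
{
    while (!try_take())
        sleep(1);                           /* then try again from the beginning */
}

static void release(void)
{
    atomic_fetch_sub(&lock_count, 1);
}
```

Because the check happens after the increment, the count can briefly exceed the mark. Backing off at the halfway point leaves roughly 16K of headroom for such transient overshoot before a roll-over could occur, which appears to be the reason for choosing "halfway" rather than the limit itself.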

Ok. I'm too nice a guy to think about malicious users, so I simply considered
the kernel error case.
You probably want a diagnostic so people who get mysterious slowdowns can
report:
/var/log/messages included the message "Too many users on lock 0x..."
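
A diagnostic of this kind could be emitted once per lock rather than on every failed attempt, so that a malicious user cannot flood the log. The sketch below is a user-space illustration only: `struct lock_diag` and `report_overload` are hypothetical names, and a kernel version would presumably call printk instead of formatting into a buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-lock diagnostic state; not taken from any kernel source. */
struct lock_diag {
    bool reported;      /* has the message already been emitted? */
};

/*
 * Format the warning into buf the first time it is needed.
 * Returns true if a message was written, false if it was suppressed.
 */
static bool report_overload(struct lock_diag *d, const void *lock,
                            char *buf, size_t len)
{
    if (d->reported)
        return false;
    d->reported = true;
    snprintf(buf, len, "Too many users on lock %p", lock);
    return true;
}
```

The exact text produced by `%p` is implementation-defined, so the address may not appear in the `0x...` form on every platform.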


> 
> So I'm not worried about this at all. I just want people _always_ to think
> about "how could I mis-use this if I was _truly_ evil", and making sure it
> doesn't cause problems for others on the system.
> 
> (NOTE: This does not mean that the kernel has to do anything _reasonable_
> under all circumstances. There are cases where Linux has decided that
> "this is not something a reasonable program can do, and if you try to do
> it, we'll give you random results back - but they will not be _security_
> holes". We don't need to be _nice_ to unreasonable requests. We just must
> never panic, otherwise crash or allow unreasonable requests to mess up
> _other_ people)
> 
>   Linus

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: rw_semaphores

2001-04-16 Thread yodaiken

On Tue, Apr 10, 2001 at 08:47:34AM +0100, David Howells wrote:
> 
> Since you're willing to use CMPXCHG in your suggested implementation, would it
> make life easier if you were willing to use XADD too?
> 
> Plus, are you really willing to limit the number of readers or writers to be
> 32767? If so, I think I can suggest a way that limits it to ~65535 apiece
> instead...

I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a 
major failure and I can't. To me: the result of an attempt by the 32,768th locker
should be a kernel panic. Is there a reasonable scenario where this is wrong?


> 
> David

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: POSIX 52 53? 54

2001-04-13 Thread yodaiken

On Fri, Apr 13, 2001 at 08:46:14AM -0400, Mark Salisbury wrote:
> these are covered in IEEE P1003.13, D9 September 1997 AD212.  it is available
> from IEEE for a "nominal" fee.
> 
> they are 4 defined subsets of POSIX that are deemed appropriate for real-time
> systems.
> 
> unfortunately, the sub in subset is a small delta from the full set.
> 
> the subsets are:
>   PSE51: Minimal Realtime System Profile
>   PSE52: Realtime Controller System Profile
>   PSE53: Dedicated Realtime System Profile
>   PSE54: Multi-purpose Realtime System Profile.
> 
> now, PSE51 is already about 90% of POSIX, so I don't really see what is so
> minimal about it.  the others require even more.

Actually PSE51 seems to me to be pretty smart, which is why, even though I
swore we would never adopt the bloated, slow POSIX standard for RTLinux,
the discovery of 1003.13 changed my mind. PSE51 says a minimal RT system
can look like a single process with "main" as the OS and with signal
handlers and threads for applications. They note that POSIX does require
"open/close/read/write", but in PSE51 we don't need to offer POSIX file
semantics: this is actually pretty nice for us. We install interrupt
handlers with sigaction, and use the thread creation/deletion and standard
synchronization APIs, which I have grown to semi-like. 1003.13 gets around
"fork" and such by simply adopting "single process" semantics. According
to the standard, we gotta have "fork", but it can fail due to too many
processes (Linus hates this, but he thinks CC-NUMA scales, so ...).
Basically, PSE51 allows for a real standard API without requiring the system
to stop being hard realtime.
And then we have one thread as a PSE54 system, so we can do
    pthread_kill(linux_thread(), LINUX_IRQ + n)
to send interrupt "n" to Linux.
And Alan Cox invented a brilliant method for fault tolerance where the PSE51
system runs a watchdog thread for the OS and has a recovery thread that
does
    vulture:
        /* wait for death */
        pthread_join(linux_thread());
        /* note that critical RT processing continues */
        generate_blue_screen("NT EMULATION MODE ON: PLEASE BE PATIENT WHILE "
                             "WE RECOVER. .\n");
        pthread_create(linux_thread(), &linux_attr, restart_linux, 0);
        goto vulture;

This could be implemented quite easily.
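
The same supervise-and-restart pattern can be tried out with ordinary POSIX threads. The sketch below is an illustration only: `guest_os` is a hypothetical stand-in for restart_linux that "dies" immediately, `vulture` and `boots` are made-up names, and the restart count is bounded so the example terminates.

```c
#include <pthread.h>
#include <stddef.h>

static int boots;   /* written by the guest thread, read only after it is joined */

/* Stand-in for the supervised OS thread: records a boot and exits at once. */
static void *guest_os(void *arg)
{
    (void)arg;
    boots++;
    return NULL;
}

/*
 * Start the guest, wait for it to die, and start it again, max_restarts
 * times in total. Returns the number of boots, or -1 on a pthread error.
 */
static int vulture(int max_restarts)
{
    for (int i = 0; i < max_restarts; i++) {
        pthread_t guest;

        if (pthread_create(&guest, NULL, guest_os, NULL) != 0)
            return -1;
        if (pthread_join(guest, NULL) != 0)   /* wait for death */
            return -1;
        /* other threads keep running here; only the guest is restarted */
    }
    return boots;
}
```

Note that the real `pthread_join` takes a second argument for the thread's return value, and `pthread_create` takes a pointer to a `pthread_t`, so the calls here differ slightly from the shorthand in the message.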

> 
> On Thu, 12 Apr 2001, [EMAIL PROTECTED] wrote:
> > POSIX 1003.13 defines profiles 51-4 where 51 is a single POSIX
> > process with multiple threads (RTLinux) and 54 is a full POSIX OS
> > with the RT extensions (Linux).
> > 
> > On Thu, Apr 12, 2001 at 08:15:34PM -0700, george anzinger wrote:
> > > Any one know any thing about a POSIX draft 52 or 53 or 54.  I think they
> > > are suppose to have something to do with real time.
> > > 
> > > Where can they be found?  What do they imply for the kernel?
> > > 
> > > George
> > 
> > -- 
> > -
> > Victor Yodaiken 
> > Finite State Machine Labs: The RTLinux Company.
> >  www.fsmlabs.com  www.rtlinux.com
> > 
> -- 
> /***
> **   Mark Salisbury | Mercury Computer Systems**
> **   [EMAIL PROTECTED] | System OS - Kernel Team **
> ****
> **  I will be riding in the Multiple Sclerosis**
> **  Great Mass Getaway, a 150 mile bike ride from **
> **  Boston to Provincetown.  Last year I raised   **
> **  over $1200.  This year I would like to beat   **
> **  that.  If you would like to contribute,   **
> **  please contact me.**
> ***/

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: POSIX 52 53? 54

2001-04-12 Thread yodaiken


POSIX 1003.13 defines profiles 51-4 where 51 is a single POSIX
process with multiple threads (RTLinux) and 54 is a full POSIX OS
with the RT extensions (Linux).

On Thu, Apr 12, 2001 at 08:15:34PM -0700, george anzinger wrote:
> Any one know any thing about a POSIX draft 52 or 53 or 54.  I think they
> are suppose to have something to do with real time.
> 
> Where can they be found?  What do they imply for the kernel?
> 
> George

-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: [Lse-tech] Re: [PATCH for 2.5] preemptible kernel

2001-04-10 Thread yodaiken

On Tue, Apr 10, 2001 at 09:08:16PM -0700, Paul McKenney wrote:
> > Disabling preemption is a possible solution if the critical section is
> > short - less than 100us - otherwise preemption latencies become a problem.
> 
> Seems like a reasonable restriction.  Of course, this same limit applies
> to locks and interrupt disabling, right?

So, supposing 1/2 us per update,

    lock process list
    for every process update pgd
    unlock process list

is ok if #processes < 200, but can cause some unspecified system failure
due to a dependency on the 100us limit otherwise?
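
The arithmetic behind the figure of 200 can be written down as a small check. This is only a worked example of the numbers in the message (0.5 us per update, 100 us limit); `hold_time_ns` and `within_budget` are hypothetical helper names, and times are kept in nanoseconds so everything stays in integer arithmetic.

```c
/* Time the process list stays locked for nproc updates, in nanoseconds. */
static long hold_time_ns(long nproc, long per_update_ns)
{
    return nproc * per_update_ns;
}

/* True if the whole loop finishes in strictly less than the budget. */
static int within_budget(long nproc, long per_update_ns, long budget_ns)
{
    return hold_time_ns(nproc, per_update_ns) < budget_ns;
}
```

With 500 ns per update and a 100000 ns budget, 199 processes fit and 200 do not, which matches the "#processes < 200" condition above. A slower machine raises the per-update cost and lowers that threshold proportionally.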

And on a slower machine or with some heavy I/O possibilities 

We have a tiny little kernel to worry about in RTLinux and it's quite
hard for us to keep track of all possible delays in such cases. How's this
going to work for Linux?


-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: No 100 HZ timer !

2001-04-10 Thread yodaiken

On Tue, Apr 10, 2001 at 04:43:36AM -0700, David Schleef wrote:
> However, on machines without a monotonically increasing counter,
> i.e., the TSC, you have to use 8254 timer 0 as both the timebase
> and the interval counter -- you end up slowly losing time because
> of the race condition between reading the timer and writing a
> new interval.  RTAI's solution is to disable kd_mksound and
> use timer 2 as a poor man's TSC.  If either of those is too big
> of a price, it may suffice to report that the timer granularity
> on 486's is 10 ms.

Just for the record, Michael Barabanov did this in RTLinux from before
kd_mksound was a function pointer in 1995. Michael had an optimization
attempt using channel 1 for a while, but on very slow machines this 
was not sufficient and he went back to channel 2. Of course, the 
fundamental problem is that board designers keep putting a 1920s
part in machines built in 2001. 

Here's the comment from the RTLinux 0.5 patch -- all available on the archives
on rtlinux.com.

+/* The main procedure; resets the 8254 timer to generate an interrupt.  The
+ * tricky part is to keep the global time while reprogramming it.  We latch
+ * counters 0 and 2 atomically before and after reprogramming to figure it out.
+ */
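
One way to latch two 8254 counters in a single operation is the chip's read-back command, which freezes the selected counts together. Whether the RTLinux 0.5 patch used exactly this command is an assumption; the sketch below only builds the command byte and does the tick arithmetic, with no port I/O, and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* 8254 read-back command layout (written to the control port, 0x43 on a PC). */
#define PIT_READBACK      0xC0u          /* bits 7,6 = 11 select read-back */
#define PIT_RB_NO_COUNT   0x20u          /* bit 5 set: do NOT latch the count */
#define PIT_RB_NO_STATUS  0x10u          /* bit 4 set: do NOT latch the status */
#define PIT_RB_COUNTER(n) (0x02u << (n)) /* bits 1..3 select counters 0..2 */

/* Command byte that latches the counts (not status) of the selected counters. */
static uint8_t pit_latch_counts_cmd(unsigned counter_bits)
{
    return (uint8_t)(PIT_READBACK | PIT_RB_NO_STATUS | counter_bits);
}

/*
 * Ticks elapsed between two latched reads of a down-counter. Assumes a
 * free-running counter with a full 16-bit period that decrements by one
 * per input clock, so the subtraction wraps correctly modulo 65536.
 */
static uint16_t pit_ticks_elapsed(uint16_t before, uint16_t after)
{
    return (uint16_t)(before - after);
}
```

Latching counters 0 and 2 together gives `pit_latch_counts_cmd(PIT_RB_COUNTER(0) | PIT_RB_COUNTER(2))`, which is 0xDA. Reading both counts from the same latch instant is what removes the race between reading the timebase and reprogramming the interval.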


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread yodaiken

On Thu, Feb 01, 2001 at 04:32:48PM -0200, Rik van Riel wrote:
> On Thu, 1 Feb 2001, Alan Cox wrote:
> 
> > > Sure.  But Linus saying that he doesn't want more of that (shit, crap,
> > > I don't remember what he said exactly) in the kernel is a very good reason
> > > for thinking a little more about it.
> > 
> > No. Linus is not a God, Linus is fallible, regularly makes mistakes and
> > frequently opens his mouth and says stupid things when he is far too busy.
> 
> People may remember Linus saying a resolute no to SMP
> support in Linux ;)

And perhaps he was right!

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: [linux-audio-dev] low-latency scheduling patch for 2.4.0

2001-01-28 Thread yodaiken

On Sun, Jan 21, 2001 at 06:21:05PM -0800, Nigel Gamble wrote:
> Yes, I most emphatically do disagree with Victor!  IRIX is used for
> mission-critical audio applications - recording as well as playback - and
> other low-latency applications.  The same OS scales to large numbers of
> CPUs.  And it has the best desktop interactive response of any OS I've

And it has bloat, it's famously buggy, it is impossible to maintain, ...


> used.  I will be very happy when Linux is as good in all these areas,
> and I'm working hard to achieve this goal with negligible impact on the
> current Linux "sweet-spot" applications such as web serving.

As stated previously: I think this is a proven improbability and I have
not seen any code or designs from you to show otherwise.

> I agree.  I'm not wedded to any particular design - I just want a
> low-latency Linux by whatever is the best way of achieving that.
> However, I am hearing Victor say that we shouldn't try to make Linux
> itself low-latency, we should just use his so-called "RTLinux" environment

I suggest that you get your hearing checked. I'm fully in favor of sensible
low latency Linux. I believe however that low latency in Linux will
A. be "soft realtime", close to deadline most of the time,
B. be millisecond level on present hardware, and
C. be best implemented by careful algorithm design instead of
   "stuff the kernel with resched points" and hope for the best.

RTLinux's main focus is hard realtime: a few microseconds here and there
are critical for us and for the applications we target. For consumer
audio, this is overkill and vanilla Linux should be able to provide
services reasonably well. But ...

> for low-latency tasks.  RTLinux is not Linux, it is a separate
> environment with a separate, limited set of APIs.  You can't run XMMS,
> or any other existing Linux audio app in RTLinux.  I want a low-latency
> Linux, not just another RTOS living parasitically alongside Linux.

Nice marketing line, but it is not working code.


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [linux-audio-dev] low-latency scheduling patch for 2.4.0

2001-01-20 Thread yodaiken

On Fri, Jan 12, 2001 at 07:45:43PM -0700, Jay Ts wrote:
> Andrew Morton wrote:
> > 
> > Jay Ts wrote:
> > > 
> > > Now about the only thing left is to get it included
> > > in the standard kernel.  Do you think Linus Torvalds is more likely
> > > to accept these patches than Ingo's?  I sure hope this one works out.
> > 
> > We (or "he") need to decide up-front that Linux is to become
> > a low latency kernel. Then we need to decide the best way of
> > doing that.
> > 
> > Making the kernel internally preemptive is probably the best way of
> > doing this.  But it's a *big* task
> 
> Ouch.  Yes, I agree that the ideal path is for Linus and the other
> kernel developers and ... well, just about everyone ... is to create
> a long-range strategy and 'roadmap' that includes support for low-latency.
> 
> And making the kernel preemptive might be the best way to do that
> (and I'm saying "might"...).

Keep in mind that Ken Thompson & Dennis Ritchie did not decide on a
non-preemptive strategy for UNIX because they were unaware of such
methods or because they were stupid. And when Rob Pike designed a new
"unix", Plan 9, note that it has no preemptive kernel either, and the core
Linux designers have rejected preemptive kernels too. Now it is certainly
possible that things have changed and/or all these folks are just plain
wrong. But I wouldn't bet too much on it.

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: [linux-audio-dev] low-latency scheduling patch for 2.4.0

2001-01-20 Thread yodaiken


Let me just point out that Nigel (I think) has previously stated that
the purpose of this approach is to bring the stunning success of 
IRIX style "RT" to Linux. Since some of us believe that IRIX is a virtual
handbook of OS errors, it really comes down to a design style. I think
that simplicity and "does the main job well" wins every time over 
"really cool algorithms" and "does everything badly". Others 
disagree.


On Sat, Jan 13, 2001 at 12:30:46AM +1100, Andrew Morton wrote:
> Nigel Gamble wrote:
> > 
> > Spinlocks should not be held for lots of time.  This adversely affects
> > SMP scalability as well as latency.  That's why MontaVista's kernel
> > preemption patch uses sleeping mutex locks instead of spinlocks for the
> > long held locks.
> 
> Nigel,
> 
> what worries me about this is the Apache-flock-serialisation saga.
> 
> Back in -test8, kumon@fujitsu demonstrated that changing this:
> 
>   lock_kernel()
>   down(sem)
>   <stuff>
>   up(sem)
>   unlock_kernel()
> 
> into this:
> 
>   down(sem)
>   <stuff>
>   up(sem)
> 
> had the effect of *decreasing* Apache's maximum connection rate
> on an 8-way from ~5,000 connections/sec to ~2,000 conn/sec.
> 
> That's downright scary.
> 
> Obviously, <stuff> was very quick, and the CPUs were passing through
> this section at a great rate.
> 
> How can we be sure that converting spinlocks to semaphores
> won't do the same thing?  Perhaps for workloads which we
> aren't testing?
> 
> So this needs to be done with caution.
> 
> As davem points out, now we know where the problems are
> occurring, a good next step is to redesign some of those
> parts of the VM and buffercache.  I don't think this will
> be too hard, but they have to *want* to change :)
> 
> Some of those algorithms are approximately O(N^2), for huge
> values of N.
> 
> 

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-13 Thread yodaiken
  kmem_cache_t* cachep;
> + /* empty test measurement: */
> + printk(" kernel cpu benchmark started **\n");
> + clean_buf();
> + set_current_state(TASK_UNINTERRUPTIBLE);
> + schedule_timeout(200);
> + for(i=0;i<100;i++) {
> + start_measure();
> + return_immediately(NULL);
> + return_immediately(NULL);
> + return_immediately(NULL);
> + return_immediately(NULL);
> + end_measure();
> + }
> + print_buf("zero");
> + clean_buf();
> +
> + set_current_state(TASK_UNINTERRUPTIBLE);
> + schedule_timeout(200);
> + for(i=0;i<100;i++) {
> + start_measure();
> + return_immediately(NULL);
> + return_immediately(NULL);
> + smp_call_function(return_immediately,NULL,
> + 1, 1);
> + return_immediately(NULL);
> + return_immediately(NULL);
> + end_measure();
> + }
> + print_buf("empty smp_call_function()");
> + clean_buf();
> +
> + set_current_state(TASK_UNINTERRUPTIBLE);
> + schedule_timeout(200);
> + for(i=0;i<100;i++) {
> + start_measure();
> + return_immediately(NULL);
> + return_immediately(NULL);
> + smp_call_function(just_one_page,NULL,
> + 1, 1);
> + just_one_page(NULL);
> + return_immediately(NULL);
> + return_immediately(NULL);
> + end_measure();
> + }
> + print_buf("flush_one_page()");
> + clean_buf();
> +
> + return -EINVAL;
> + }
>  
>   dev_dummy.init = dummy_init;
>   SET_MODULE_OWNER(&dev_dummy);
> 


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Multithreaded locks.c

2000-11-04 Thread yodaiken

On Sun, Nov 05, 2000 at 10:57:57AM +1100, Andrew Morton wrote:
> Even the DG/UX manpage doesn't say what happens when you sidegrade
> the lock.  LOCK_EX->LOCK_EX :)

Suggested code:
 printk("Don't do that\n"); return -EKNUCKLEHEAD;


-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Updated 2.4 TODO List -- new addition WAS(test9 PCI

2000-10-12 Thread yodaiken

On Thu, Oct 12, 2000 at 06:26:57AM -0400, Horst von Brand wrote:
> [EMAIL PROTECTED] said:
> > Foolhardy as it may be, people do _use_ the operating system to run
> > important applications and an "application goes down or screws up" can be
> > quite serious.
> 
> Yes. But "the kernel screws up and crashes" is more serious, as it takes
> _all_ applications with it. And if it is "screws up and scribbles on the
> disks" the losses are even much more serious.

Really? More serious than a multithreaded database failing to synchronize
as promised? Get real.
You can't do this type of ranking. If the compiler is bad, the entire
system is a joke.  Users don't care whether the accounting system fails
and the telephone stops working because of a compiler error or a kernel
crash.







> -- 
> Horst von Brand [EMAIL PROTECTED]
> Casilla 9G, Vin~a del Mar, Chile   +56 32 672616

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Updated 2.4 TODO List -- new addition WAS(test9 PCI

2000-10-11 Thread yodaiken

On Wed, Oct 11, 2000 at 11:21:06PM -0400, Horst von Brand wrote:
> also moves forward a lot faster than glibc, and grows a lot. A bug in glibc
> means an application goes down or screws up, a bug in the kernel can mean
> massive data loss in no time at all.

Foolhardy as it may be, people do _use_ the operating system to run
important applications, and an "application goes down or screws up" can be
quite serious.



-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Calling current() from interrupt context

2000-10-10 Thread yodaiken

On Mon, Oct 09, 2000 at 11:30:50PM +0100, Alan Cox wrote:
> > I think I'll go for the 'current is in a well-known register'
> > approach and see how this goes...
> 
> Failing that the 2.0 approach will work, current is a global in uniprocessor
> and a #define to an array indexed by cpu id in smp

Alan, what's the ballpark expense of doing the hard_smp_id (from the APIC)
rather than *(sp)?
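
For readers following the exchange, here is a minimal user-space sketch of the two ways of finding "current" being compared: a per-CPU array indexed by CPU id (the 2.0 approach Alan describes) and masking the stack pointer. It assumes an 8 KiB aligned stack with the task structure at its base; NR_CPUS, STACK_SIZE, make_task and both lookup functions are hypothetical names for illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS     4       /* assumption: small fixed CPU count */
#define STACK_SIZE  8192u   /* assumption: 8 KiB aligned kernel stack */

struct task { int pid; };

/* Per-CPU scheme: one "current" slot per CPU, indexed by CPU id.
 * Its cost is whatever it takes to learn the CPU id (e.g. an APIC read). */
static struct task *current_set[NR_CPUS];

static struct task *current_by_cpu(unsigned cpu_id)
{
    return current_set[cpu_id];
}

/* Stack-based scheme: the task structure sits at the base of an aligned
 * stack, so masking any address inside that stack yields the task pointer. */
static struct task *current_by_sp(uintptr_t sp)
{
    return (struct task *)(sp & ~(uintptr_t)(STACK_SIZE - 1));
}

/* Build a fake aligned "kernel stack" with a task at its base.
 * Returns NULL if the allocation fails. */
static struct task *make_task(int pid)
{
    struct task *t = aligned_alloc(STACK_SIZE, STACK_SIZE);
    if (t)
        t->pid = pid;
    return t;
}
```

The stack-based lookup is a single mask on a register that is always at hand, which is why the cost of obtaining the CPU id is the interesting number in the comparison.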


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Calling current() from interrupt context

2000-10-08 Thread yodaiken

On Mon, Oct 09, 2000 at 01:02:21AM +0200, Jamie Lokier wrote:
> [EMAIL PROTECTED] wrote:
> > Looking at the [network] code, I don't see any places where "current"
> > is not valid.
> > Got some examples? 
> 
> Damn I'm being dense tonight.  No, that bug was due to calling "current"
> from the wrong process context, not from an invalid context.
> (Self-flagellate, self-flaggelate).

Don't go overboard here. 
> 
> > BTW: there is an implicit reference to "current"  in smp_processor_id. 
> 
> Yes I forgot about that.  (Self-flagellate).  However that is
> architecture specific.  If it's not an SMP Vax port, no big deal.  If it

The entire concept of an SMP vax port leaves me disoriented.



-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Calling current() from interrupt context

2000-10-08 Thread yodaiken

On Sun, Oct 08, 2000 at 03:58:55PM -0700, Mitchell Blank Jr wrote:
> [EMAIL PROTECTED] wrote:
> > Looking at the code, I don't see any places where "current" is not valid.
> > Got some examples? 
> 
> It's not that its invalid, it just doesn't make much sense.  It points to
> whatever task happened to be running when the interrupt happened.  So
> any attempt to access it is 99% likely to be a bug.

Bueno. 

> 
> > BTW: there is an implicit reference to "current"  in smp_processor_id. 
> 
> Yes, on architectures that use current->processor that is an exception
> to the rule.  After all, you know for sure that you're still on the
> same CPU as the task currently running.

This makes sense. And I wish CPU architects would put a cpu-id register
somewhere so that we could have fast computation of cpu-id on SMP machines.
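
As an illustration of the exception Mitch describes, a rough sketch of a CPU id derived from the running task instead of from hardware might look like this. The struct layout, current_task, run_on_cpu and cpu_id_from_current are assumptions made up for the example; only the idea of a per-task "processor" field comes from the thread.

```c
#include <assert.h>

struct task {
    int pid;
    int processor;   /* CPU this task is running on, set by the scheduler */
};

/* Stand-in for the kernel's "current"; in this sketch it is a plain global. */
static struct task *current_task;

/* CPU id read from the running task rather than from a hardware register. */
static int cpu_id_from_current(void)
{
    return current_task->processor;
}

/* What a scheduler would do when it places a task on a CPU. */
static void run_on_cpu(struct task *t, int cpu)
{
    t->processor = cpu;
    current_task = t;
}
```

Even in an interrupt handler the interrupted task is, by definition, on the CPU taking the interrupt, so this one field stays meaningful when the rest of the task does not.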

> 
> -Mitch

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Calling current() from interrupt context

2000-10-08 Thread yodaiken

Looking at the code, I don't see any places where "current" is not valid.
Got some examples? 

BTW: there is an implicit reference to "current"  in smp_processor_id. 



On Mon, Oct 09, 2000 at 12:30:17AM +0200, Jamie Lokier wrote:
> Kenn Humborg wrote:
> > My feeling is that interrupt code has no business calling current(),
> > but I don't know the kernel well enough to be sure.  Is there any
> > interrupt-level code that calls current() or is it a design
> > principle that it cannot be called?
> 
> It's a design principle that you must not call "current" in interrupt,
> bottom half or tasklet context.  From time to time buggy code is found
> to do this, and it gets away with it.  (See recent thread on network I/O
> signal delivery using the wrong credentials due to a bug like this).
> 
> So if you can make the machine crash utterly when calling "current" in
> irq context, or when dereferencing the result, that would probably be a
> good thing :-)
> 
> -- Jamie

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VM

2000-09-27 Thread yodaiken

On Wed, Sep 27, 2000 at 09:42:45AM +0200, Ingo Molnar wrote:
> 
> On Tue, 26 Sep 2000, Pavel Machek wrote:
> of the VM allocation issues. Returning NULL in kmalloc() is just a way to
> say: 'oops, we screwed up somewhere'. And i'd suggest to not work around

That is not at all how it is currently used in the kernel. 

> such screwups by checking for NULL and trying to handle it. I suggest to
> rather fix those screwups.

kmalloc() returns NULL when there is not enough memory to satisfy the
request. What's wrong with that?
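
A small user-space sketch of the convention being defended: the allocator reports failure with NULL, and the caller turns that into -ENOMEM instead of assuming success. fake_kmalloc, fake_kfree, setup_buffer and the fixed budget are stand-ins invented for the example, not the kernel's allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: a fixed budget plays the role of "available memory". */
static size_t budget = 4096;

/* Hypothetical stand-in for kmalloc(): NULL when the request cannot be met. */
static void *fake_kmalloc(size_t size)
{
    if (size > budget)
        return NULL;
    void *p = malloc(size);
    if (p)
        budget -= size;
    return p;
}

static void fake_kfree(void *p, size_t size)
{
    free(p);
    budget += size;
}

/* The caller checks for NULL and reports -ENOMEM to its own caller. */
static int setup_buffer(void **out, size_t size)
{
    void *p = fake_kmalloc(size);
    if (!p)
        return -ENOMEM;
    *out = p;
    return 0;
}
```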


-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-26 Thread yodaiken

On Mon, Sep 25, 2000 at 05:14:11PM -0600, Erik Andersen wrote:
> On Mon Sep 25, 2000 at 02:04:19PM -0600, [EMAIL PROTECTED] wrote:
> > 
> > > all of the pending requests just as long as they are serialised, is
> > > this a problem?
> > 
> > I think you are solving the wrong problem. On a small memory machine, the kernel,
> > utilities, and applications should be configured to use little memory.  
> > BusyBox is better than BeanCount. 
> > 
> 
> Granted that smaller apps can help -- for a particular workload.  But while I
> am very partial to BusyBox (in fact I am about to cut a new release) I can
> assure you that OOM is easily possible even when your user space is tiny.  I do
> it all the time.  There are mallocs in busybox and when under memory pressure,
> the kernel still tends to fall over...

Operating systems cannot make more memory appear by magic.
The question is really about the best strategy for dealing with low memory. In my
opinion, the OS should not try to out-think physical limitations. Instead, the OS
should take as little space as possible and let user-level code manage space
cleverly. In a truly embedded system, there can easily be a user-level root
process that watches memory usage and prevents DoS attacks -- if the OS provides
settable, enforced quotas etc.
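
One existing example of a settable, enforced quota is the POSIX address-space limit. The sketch below, which assumes a Linux-like system where RLIMIT_AS is enforced, caps a child process and checks that an oversized malloc() is refused; cap_address_space and oversized_malloc_is_refused are helper names invented here.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lower the soft address-space limit of the calling process.
 * Returns 0 on success, -1 on failure. */
static int cap_address_space(rlim_t bytes)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_AS, &lim) != 0)
        return -1;
    lim.rlim_cur = bytes;
    return setrlimit(RLIMIT_AS, &lim);
}

/* Run the experiment in a child so the limit does not affect the caller.
 * Returns 1 if the oversized malloc was refused, 0 if it succeeded,
 * -1 if the experiment itself could not be run. */
static int oversized_malloc_is_refused(rlim_t cap, size_t request)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (cap_address_space(cap) != 0)
            _exit(2);
        void *p = malloc(request);
        if (p)
            ((volatile char *)p)[0] = 1;   /* touch it so the call is real */
        _exit(p == NULL ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 0)
        return 1;
    if (WEXITSTATUS(status) == 1)
        return 0;
    return -1;
}
```

A user-level watchdog could apply such limits to the processes it launches, leaving the policy outside the kernel.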


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-26 Thread yodaiken

On Tue, Sep 26, 2000 at 11:07:36AM +0100, Stephen C. Tweedie wrote:
> Hi,
> 
> On Mon, Sep 25, 2000 at 03:12:50PM -0600, [EMAIL PROTECTED] wrote:
> > > > 
> > > > I'm not too sure of what you have in mind, but if it is
> > > >  "process creates vast virtual space to generate many page table
> > > >   entries -- using mmap"
> > > > the answer is, virtual address space quotas and mmap should kill 
> > > > the process on low mem for page tables.
> > > 
> > > No.  Page tables are not freed after munmap (and for good reason).  The
> > > counting of page table "beans" is critical.
> > 
> > I've seen the assertion before, reasons would be interesting.
> 
> Reason 1: under DoS attack, you want to target not the process using
> the most resources, but the *user* using the most resources (else a
> fork-bomb style attack can work around your OOM-killer algorithms).

Ok.
  if(over_allocated_page_tables(task->uid) ) return ENOMEM;

makes sense in "fork".   I guess the argument here is not about whether
accounting is good, it's about where the accounting should be done. To me
the alternatives of

  if(preallocate_pages(page_table_size_for_this_process()) == -1)return error
 then actually allocate making sure to adjust counts if some other
 error turns up and with something taking care of how the pre-allocation
 works while we are sleeping waiting for possibly unrelated resources.

or
  just kmalloc with kmalloc magically juggling resources in some safe way


seem less clear.
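
To make the first alternative concrete, here is a rough sketch of a fork-time check that charges page tables to the user, not to the individual process, so that many small processes cannot slip under the limit. The table size, the struct, fake_fork and fake_exit are assumptions for illustration, and over_allocated_page_tables takes an extra page count that the one-liner above does not have.

```c
#include <assert.h>
#include <errno.h>

#define MAX_UIDS 8              /* assumption: tiny fixed table for the sketch */

struct user_account {
    unsigned pt_pages;          /* page-table pages charged to this user */
    unsigned pt_limit;          /* per-user quota */
};

static struct user_account users[MAX_UIDS];

/* Hypothetical check: would charging `pages` more put this user over quota? */
static int over_allocated_page_tables(unsigned uid, unsigned pages)
{
    return users[uid].pt_pages + pages > users[uid].pt_limit;
}

/* Fork-time accounting: refuse before allocating anything, charge on success. */
static int fake_fork(unsigned uid, unsigned child_pt_pages)
{
    if (uid >= MAX_UIDS)
        return -EINVAL;
    if (over_allocated_page_tables(uid, child_pt_pages))
        return -ENOMEM;
    users[uid].pt_pages += child_pt_pages;
    return 0;
}

/* Exit-time accounting: return the child's charge to the user. */
static void fake_exit(unsigned uid, unsigned child_pt_pages)
{
    users[uid].pt_pages -= child_pt_pages;
}
```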

   

 

> Reason 2: if you've got tasks stuck in low-level page allocation
> routines, then you can't immediately kill -9 them, so reactive OOM
> killing always has vulnerabilities --- to be robust in preventing
> resource exhaustion you want limits on the use of those resources
> before they are exhausted --- the necessary accounting being part of
> what we refer to as "beancounter".

Doesn't the problem really come from doing low-level page allocation at too high a level?
That is, instead of select calling get_free_page, maybe it should call
get_per_process_page(myprocess) or even get_per_process_file_use_page(myprocess).
Then we could have config-optional per-process pinned page accounting, with the
possibility of doing something sensible in a user-space daemon when memory is low.
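
A minimal sketch of what such an interface could look like, assuming the charge is taken in the same call that hands out the page so the count cannot drift from the allocation. get_per_process_page is the name suggested above; its signature, struct proc, put_per_process_page and the malloc() stand-in for get_free_page are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u

struct proc {
    unsigned pinned_pages;   /* pages currently charged to this process */
    unsigned pinned_limit;   /* per-process quota, set by policy */
};

/* Charge the page to the process in the same step that allocates it;
 * refuse with -ENOMEM once the quota is reached. */
static void *get_per_process_page(struct proc *p, int *err)
{
    if (p->pinned_pages >= p->pinned_limit) {
        *err = -ENOMEM;
        return NULL;
    }
    void *page = malloc(PAGE_SIZE);   /* stand-in for get_free_page() */
    if (!page) {
        *err = -ENOMEM;
        return NULL;
    }
    p->pinned_pages++;
    *err = 0;
    return page;
}

/* Release the page and its charge together. */
static void put_per_process_page(struct proc *p, void *page)
{
    free(page);
    p->pinned_pages--;
}
```

The per-process counters are exactly what a user-space daemon would need to read in order to decide which process to act on when memory runs low.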

> 
> --Stephen

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-26 Thread yodaiken

On Tue, Sep 26, 2000 at 10:54:23AM +0100, Stephen C. Tweedie wrote:
> Beancounter is a framework for user-level accounting.  _What_ you
> account is up to the callers.  Maybe this has been a miscommunication,
> but beancounter is all about allowing callers to account for stuff
> before allocation, not about having the page allocation functions
> themselves enforce quotas.


Per-user, system-wide, and per-process quotas are one thing; a generic
pre-allocate-and-then-allocate scheme seems to me to be an error-prone
way of getting there. In particular, I think it is dangerous to have a pre-count that
is only approximately tethered to the thing it is counting -- in the memory allocation
case we were discussing, you need to make sure that the pre-allocations are for memory
that is really going to be allocated soon and that each one is later matched with a
free in some way.

So, to me, a quota-bounded allocate_page_table(process_id) makes much more sense than
pre-allocate counting or, even worse, a "smart" kmalloc that never fails.
If the problem is unaccounted-for page tables, then account for page tables and
return -EYOURPROCESSISOUTOFCONTROL so that the calling kernel code can take the
responsible action.
   

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-26 Thread yodaiken

On Tue, Sep 26, 2000 at 11:07:36AM +0100, Stephen C. Tweedie wrote:
> Hi,
> 
> On Mon, Sep 25, 2000 at 03:12:50PM -0600, [EMAIL PROTECTED] wrote:
> 
> > > > I'm not too sure of what you have in mind, but if it is
> > > >  "process creates vast virtual space to generate many page table
> > > >   entries -- using mmap"
> > > > the answer is, virtual address space quotas and mmap should kill 
> > > > the process on low mem for page tables.
> > > 
> > > No.  Page tables are not freed after munmap (and for good reason).  The
> > > counting of page table "beans" is critical.
> > 
> > I've seen the assertion before, reasons would be interesting.
> 
> Reason 1: under DoS attack, you want to target not the process using
> the most resources, but the *user* using the most resources (else a
> fork-bomb style attack can work around your OOM-killer algorithms).

Ok.
  if(over_allocated_page_tables(task->uid) ) return ENOMEM;

makes sense in "fork".   I guess the argument here is not about whether
accounting is good, it's about where the accounting should be done. To me
the alternatives of

  if(preallocate_pages(page_table_size_for_this_process()) == -1)return error
 then actually allocate making sure to adjust counts if some other
 error turns up and with something taking care of how the pre-allocation
 works while we are sleeping waiting for possibly unrelated resources.

or
  just kmalloc with kmalloc magically juggling resources in some safe way


seem less clear.

   

 

> Reason 2: if you've got tasks stuck in low-level page allocation
> routines, then you can't immediately kill -9 them, so reactive OOM
> killing always has vulnerabilities --- to be robust in preventing
> resource exhaustion you want limits on the use of those resources
> before they are exhausted --- the necessary accounting being part of
> what we refer to as "beancounter".

Doesn't the problem really come from doing low-level page allocation at too high a level?
That is, instead of select doing get_free_page, maybe it should do 
get_per_process_page(myprocess) or even get_per_process_file_use_page(myprocess).
Then we could have config-optional per-process pinned page accounting, with the 
possibility of doing something sensible in a user-space daemon when memory is low.
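
A rough user-space sketch of the get_per_process_page() idea, under stated
assumptions: struct proc_acct, put_per_process_page() and largest_pinner() are
hypothetical names, and malloc() stands in for get_free_page(). It only shows that
charging at the call site gives a daemon something concrete to look at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size */

struct proc_acct {              /* hypothetical per-process record */
    int pid;
    unsigned long pinned;       /* pages pinned on behalf of this process */
};

/* Allocate one page and charge it to the owning process. */
void *get_per_process_page(struct proc_acct *p)
{
    void *page = malloc(SKETCH_PAGE_SIZE);  /* stand-in for get_free_page() */

    if (page != NULL)
        p->pinned++;
    return page;
}

/* Release one page and remove the charge. */
void put_per_process_page(struct proc_acct *p, void *page)
{
    assert(p->pinned > 0);
    free(page);
    p->pinned--;
}

/* What a user-space daemon might ask for when memory is low. */
const struct proc_acct *largest_pinner(const struct proc_acct *tab, size_t n)
{
    const struct proc_acct *worst = NULL;
    size_t i;

    for (i = 0; i < n; i++)
        if (worst == NULL || tab[i].pinned > worst->pinned)
            worst = &tab[i];
    return worst;
}
```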

 
> --Stephen

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-26 Thread yodaiken

On Mon, Sep 25, 2000 at 05:14:11PM -0600, Erik Andersen wrote:
> On Mon Sep 25, 2000 at 02:04:19PM -0600, [EMAIL PROTECTED] wrote:
> > 
> > > all of the pending requests just as long as they are serialised, is
> > > this a problem?
> > 
> > I think you are solving the wrong problem. On a small memory machine, the kernel,
> > utilities, and applications should be configured to use little memory.  
> > BusyBox is better than BeanCount. 
> > 
> 
> Granted that smaller apps can help -- for a particular workload.  But while I
> am very partial to BusyBox (in fact I am about to cut a new release) I can
> assure you that OOM is easily possible even when your user space is tiny.  I do
> it all the time.  There are mallocs in busybox and when under memory pressure,
> the kernel still tends to fall over...

Operating systems cannot make more memory appear by magic.
The question is really about the best strategy for dealing with low memory. In my
opinion, the OS should not try to out-think physical limitations. Instead, the OS 
should take as little space as possible and give user level the ability to manage 
space cleverly. In a truly embedded system, there can easily be a user-level
root process that watches memory usage and prevents DoS attacks -- if the OS provides
settable, enforced quotas etc. 


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 04:47:21PM -0400, Benjamin C.R. LaHaise wrote:
> On Mon, 25 Sep 2000 [EMAIL PROTECTED] wrote:
> 
> > On Mon, Sep 25, 2000 at 09:23:48PM +0100, Alan Cox wrote:
> > > > my prediction is that if you show me an example of 
> > > > DoS vulnerability,  I can show you fix that does not require bean counting.
> > > > Am I wrong?
> > > 
> > > I think so. Page tables are a good example
> > 
> > I'm not too sure of what you have in mind, but if it is
> >  "process creates vast virtual space to generate many page table
> >   entries -- using mmap"
> > the answer is, virtual address space quotas and mmap should kill 
> > the process on low mem for page tables.
> 
> No.  Page tables are not freed after munmap (and for good reason).  The
> counting of page table "beans" is critical.

I've seen the assertion before, reasons would be interesting.


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 09:46:35PM +0100, Alan Cox wrote:
> > I'm not too sure of what you have in mind, but if it is
> >  "process creates vast virtual space to generate many page table
> >   entries -- using mmap"
> > the answer is, virtual address space quotas and mmap should kill 
> > the process on low mem for page tables.
> 
> Those quotas being exactly what beancounter is

But that is a function-specific counter, not a counter in the 
alloc code.


-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 09:23:48PM +0100, Alan Cox wrote:
> > my prediction is that if you show me an example of 
> > DoS vulnerability,  I can show you fix that does not require bean counting.
> > Am I wrong?
> 
> I think so. Page tables are a good example

I'm not too sure of what you have in mind, but if it is
 "process creates vast virtual space to generate many page table
  entries -- using mmap"
the answer is, virtual address space quotas and mmap should kill 
the process on low mem for page tables.


-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 08:25:49PM +0100, Stephen C. Tweedie wrote:
> Hi,
> 
> On Mon, Sep 25, 2000 at 12:34:56PM -0600, [EMAIL PROTECTED] wrote:
> 
> > > > Process 1,2 and 3 all start allocating 20 pages
> > > > now 57 pages are locked up in non-swapable kernel space and the system deadlocks OOM.
> > > 
> > > Or go the beancounter route: process 1 asks "can I pin 20 pages", gets
> > > told "yes", and goes allocating them, blocking as necessary until it
> > 
> > So you have a "pre-allocation allocator"?  Leads to interesting and hard to detect
> > bugs with old code that does not pre-allocate or with code that incorrectly pre-allocates
> > or that blocks on something unrelated
> 
> Right, but if the alternative is spurious ENOMEM when we can satisfy

An ENOMEM is not spurious if there is not enough memory. UNIX does not ask the
OS to do impossible tricks.

> all of the pending requests just as long as they are serialised, is
> this a problem?

I think you are solving the wrong problem. On a small memory machine, the kernel,
utilities, and applications should be configured to use little memory.  
BusyBox is better than BeanCount. 


> However, you just can't escape from the fact that on low memory
> machinnes, we *need* beancounter-style accounting of pinned pages or
> we'll be in Deep Trouble (TM).  We already have nasty DoS situations

What we need is simple kernel code that does not hold resources
going into a possible deadlock situation. 

> which are embarassingly easy to reproduce.  If we need such
> beancounter protection, AND such protection can prevent the situation
> you describe, then do we need to go looking for another way of
> achieving the same protection?


On general principles, I don't see any substitute for clean code in the kernel, and
my prediction is that if you show me an example of a
DoS vulnerability, I can show you a fix that does not require bean counting.
Am I wrong?





-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 07:24:53PM +0100, Stephen C. Tweedie wrote:
> Hi,
> 
> On Mon, Sep 25, 2000 at 12:13:15PM -0600, [EMAIL PROTECTED] wrote:
> 
> > > Definitely not.  GFP_ATOMIC is reserved for things that really can't
> > > swap or schedule right now.  Use GFP_ATOMIC indiscriminately and you'll
> > > have to increase the number of atomic-allocatable pages.
> > 
> > Process 1,2 and 3 all start allocating 20 pages
> >   process 1 stalls after allocating 19
> >   some memory is freed and process 2 runs and stall after allocating 19
> >   some memory is free and process 3 runs and stalls after allocating 19
> >  
> > now 57 pages are locked up in non-swapable kernel space and the system deadlocks OOM.
> 
> Or go the beancounter route: process 1 asks "can I pin 20 pages", gets
> told "yes", and goes allocating them, blocking as necessary until it

So you have a "pre-allocation allocator"?  That leads to interesting and hard-to-detect
bugs with old code that does not pre-allocate, with code that pre-allocates incorrectly,
or with code that blocks on something unrelated:

   preallocate 20 pages
   get first
   ask for an inode -- block waiting for an inode


or
   preallocate 20 pages
   if(checkuserpath())return -ENOWAY; /* stranding my pre-allocate */
   else get them pages


What's nice about these is that they don't cause errors in testing, and they seem more 
difficult to spot than cases where allocated memory gets stranded.
Doesn't the alloc_vec method seem simpler to you?
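
To make the second case concrete, here is a small user-space sketch of a stranded
reservation. The names are assumptions for illustration only: struct reserve_pool,
unreserve_pages() and buggy_op() are hypothetical, and -EACCES stands in for the
made-up -ENOWAY. Nothing fails on the buggy path itself; the damage only shows up
later as an ENOMEM somewhere else.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reservation pool: counts promised pages, holds no memory. */
struct reserve_pool {
    unsigned long free_pages;   /* pages that exist */
    unsigned long reserved;     /* pages promised to callers */
};

/* Promise n pages to the caller, or fail if they are already promised. */
int preallocate_pages(struct reserve_pool *pool, unsigned long n)
{
    if (pool->free_pages - pool->reserved < n)
        return -ENOMEM;
    pool->reserved += n;
    return 0;
}

/* Give a promise back. */
void unreserve_pages(struct reserve_pool *pool, unsigned long n)
{
    assert(pool->reserved >= n);
    pool->reserved -= n;
}

/* Buggy caller: the early return strands the reservation. */
int buggy_op(struct reserve_pool *pool, int path_ok)
{
    if (preallocate_pages(pool, 20) != 0)
        return -ENOMEM;
    if (!path_ok)
        return -EACCES;          /* forgot unreserve_pages(pool, 20) */
    unreserve_pages(pool, 20);   /* pretend the pages were used and released */
    return 0;
}
```

With 50 pages in the pool, two calls down the error path leave 40 pages promised
to nobody, and the next perfectly valid caller is refused.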

> gets them.  Process 2 asks "can *I* pin 20 pages" and the answer is
> either "not right now", in which case it waits for process 1 to
> release its reservation, or "no, you've exceeded your user quota" in

Or for someone else to free more pages ... 

> which case it fails with ENOMEM.  (That latter case can protect us
> against a lot of DoS attacks from local users.)

I like ENOMEM anyways.

> 
> The same accounting really needs to be done for page tables, as that
> represents one of the biggest sources of unaccounted, unswappable
> pages which user processes can cause to be created right now.



-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 08:04:54PM +0200, Jamie Lokier wrote:
> [EMAIL PROTECTED] wrote:
> > > [EMAIL PROTECTED] wrote:
> > > >walk = out;
> > > > while(nfds > 0) {
> > > > poll_table *tmp = (poll_table *) __get_free_page(GFP_KERNEL);
> > > > if (!tmp) {
> > > 
> > > Shouldn't this be GFP_USER?  (Which would also conveniently fix the
> > > problem Victor's pointing out...)
> > 
> > It should probably be GFP_ATOMIC, if I understand the mm right. 
> 
> Definitely not.  GFP_ATOMIC is reserved for things that really can't
> swap or schedule right now.  Use GFP_ATOMIC indiscriminately and you'll
> have to increase the number of atomic-allocatable pages.

Process 1, 2 and 3 all start allocating 20 pages
  process 1 stalls after allocating 19
  some memory is freed and process 2 runs and stalls after allocating 19
  some memory is freed and process 3 runs and stalls after allocating 19
 
now 57 pages are locked up in non-swappable kernel space and the system deadlocks OOM.
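
The arithmetic can be checked with a toy model. This is only a sketch under
simplifying assumptions: pages are handed out round-robin rather than in the
interleaving described above, and simulate_hold_and_wait() is a hypothetical name.
A process keeps whatever it has and stalls when the pool is empty.

```c
#include <assert.h>

#define NPROC 3
#define WANT  20

/*
 * Hand out pages one at a time, round-robin, with hold-and-wait semantics.
 * Returns the number of processes that obtained all WANT pages; held[]
 * reports how many pages each process is sitting on at the end.
 */
int simulate_hold_and_wait(unsigned long pool, unsigned long held[NPROC])
{
    int done = 0, i;

    for (i = 0; i < NPROC; i++)
        held[i] = 0;

    while (pool > 0) {
        int progressed = 0;

        for (i = 0; i < NPROC && pool > 0; i++) {
            if (held[i] < WANT) {
                held[i]++;
                pool--;
                progressed = 1;
            }
        }
        if (!progressed)
            break;   /* everyone is satisfied; leftover pages stay free */
    }

    for (i = 0; i < NPROC; i++)
        if (held[i] == WANT)
            done++;
    return done;
}
```

Starting from 57 free pages, each process ends up holding 19, nobody finishes,
and nothing is left to free.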



> > The algorithm for requesting a collection of reources and freeing all
> > of them on failure is simple, fast, and robust.
> 
> Allocation is just as fast with GFP_KERNEL/USER, just less likely to

It's not speed, it's deadlock avoidance. 

> fail and less likely to break something else that really needs
> GFP_ATOMIC allocations.

My point here is simply that error returns in memory allocation allow 
higher level kernel operations to marshal a collection of resources following
a safe algorithm that is optimized for the case when there is no memory shortage
and that only goes to the slow case when the system is stalling due to memory
shortages anyway.
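
As a sketch of that algorithm, assuming a hypothetical struct page_pool in place of
the real allocator: each request either succeeds or returns an error at once, and
the caller gives back its partial set on the first failure instead of sleeping
while holding it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical page pool standing in for the real allocator. */
struct page_pool {
    unsigned long free_pages;
    unsigned long total;
};

/* Take one page, or fail immediately instead of sleeping. */
int take_page(struct page_pool *pool)
{
    if (pool->free_pages == 0)
        return -ENOMEM;
    pool->free_pages--;
    return 0;
}

/* Return n pages to the pool. */
void give_back(struct page_pool *pool, unsigned long n)
{
    assert(pool->free_pages + n <= pool->total);
    pool->free_pages += n;
}

/*
 * Marshal `want` pages. Fast path: every take succeeds. Slow path: on the
 * first failure release everything taken so far and report -ENOMEM, so the
 * caller never waits while holding a partial set.
 */
int take_all_or_nothing(struct page_pool *pool, unsigned long want)
{
    unsigned long got;

    for (got = 0; got < want; got++) {
        if (take_page(pool) != 0) {
            give_back(pool, got);
            return -ENOMEM;
        }
    }
    return 0;
}
```

With 57 pages and three requests for 20, two succeed and the third fails holding
nothing, so it can simply be retried once somebody releases pages.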



> 
> -- Jamie

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 07:18:29PM +0200, Jamie Lokier wrote:
> [EMAIL PROTECTED] wrote:
> >walk = out;
> > while(nfds > 0) {
> > poll_table *tmp = (poll_table *) __get_free_page(GFP_KERNEL);
> > if (!tmp) {
> 
> Shouldn't this be GFP_USER?  (Which would also conveniently fix the
> problem Victor's pointing out...)

It should probably be GFP_ATOMIC, if I understand the mm right. 

The algorithm for requesting a collection of resources and freeing all of them
on failure is simple, fast, and robust. 


  

-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VMt

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 04:42:49PM +0100, Stephen C. Tweedie wrote:
> Hi,
> 
> On Mon, Sep 25, 2000 at 04:16:56PM +0100, Alan Cox wrote:
> > 
> > Unless Im missing something here think about this case
> > 
> > 2 active processes, no swap
> > 
> > #1  #2
> > kmalloc 32K kmalloc 16K
> > OK  OK
> > kmalloc 16K kmalloc 32K
> > block   block
> > 
> 
> ... and we get two wakeup_kswapd()s.  kswapd has PF_MEMALLOC and so is
> able to eat memory which processes #1 and #2 are not allowed to touch.
> Progress is made, clean pages are discarded and dirty ones queued for
> write, memory becomes free again and the world is a better place.
> 
> Or so goes the theory, at least.

from fs/select.c

    walk = out;
    while(nfds > 0) {
        poll_table *tmp = (poll_table *) __get_free_page(GFP_KERNEL);
        if (!tmp) {
            while(out != NULL) {
                tmp = out->next;
                free_page((unsigned long)out);
                out = tmp;
            }
            return NULL;
        }
        tmp->nr = 0;
        tmp->entry = (struct poll_table_entry *)(tmp + 1);
        tmp->next = NULL;
        walk->next = tmp;
        walk = tmp;
        nfds -= __MAX_POLL_TABLE_ENTRIES;
    }


> 
> --Stephen

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: the new VM

2000-09-25 Thread yodaiken

On Mon, Sep 25, 2000 at 05:26:59PM +0200, Ingo Molnar wrote:
> 
> On Mon, 25 Sep 2000, Andrea Arcangeli wrote:
> 
> > > i think the GFP_USER case should do the oom logic within __alloc_pages(),
> > 
> > What's the difference of implementing the logic outside alloc_pages?
> > Putting the logic inside looks not clean design to me.
> 
> it gives consistency and simplicity. The allocators themselves do not have
> to care about oom.


There are many cases where it is simple to do:

  if( alloc(r1) == fail) goto freeall
  if( alloc(r2) == fail) goto freeall
  if( alloc(r3) == fail) goto freeall

And the alloc functions don't know how to "freeall".

Perhaps it would be good to do an alloc_vec allocation in these cases.
  alloc_vec[0].size = n;
  ..
  alloc_vec[n].size = 0;
  if(kmalloc_all(alloc_vec) == FAIL) return -ENOMEM;
  else  alloc_vec[i].ptr is the pointer.
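
A minimal user-space sketch of the alloc_vec idea, with assumptions labeled:
struct alloc_req and kfree_all() are hypothetical names, malloc()/free() stand in
for kmalloc()/kfree(), and the return value is 0 or -1 rather than a FAIL constant.
The allocator side knows how to "freeall", so the caller needs only one test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request vector; an entry with size 0 terminates it. */
struct alloc_req {
    size_t size;
    void  *ptr;
};

/*
 * Satisfy every request or none. Returns 0 on success; on failure frees
 * whatever was obtained, clears those pointers, and returns -1.
 */
int kmalloc_all(struct alloc_req *vec)
{
    size_t i, j;

    for (i = 0; vec[i].size != 0; i++) {
        vec[i].ptr = malloc(vec[i].size);   /* stand-in for kmalloc() */
        if (vec[i].ptr == NULL) {
            for (j = 0; j < i; j++) {
                free(vec[j].ptr);
                vec[j].ptr = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Release everything a successful kmalloc_all() handed out. */
void kfree_all(struct alloc_req *vec)
{
    size_t i;

    for (i = 0; vec[i].size != 0; i++) {
        free(vec[i].ptr);
        vec[i].ptr = NULL;
    }
}
```

A caller fills in the sizes, calls kmalloc_all(), and returns -ENOMEM on failure
without any per-resource cleanup code of its own.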




-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Availability of kdb

2000-09-18 Thread yodaiken

On Sun, Sep 17, 2000 at 08:40:33PM -0700, Larry McVoy wrote:
> On Sun, Sep 17, 2000 at 02:33:40PM -0700, Marty Fouts wrote:
> I'm sort of in the middle.  I know BitKeeper very well, and it's actually
> a larger wad of code than the kernel if you toss out the device drivers.
> About the only thing I ever want a debugger for is a stacktrace back.  If
> you give me that, I usually don't need anything else; and in general, you

There are debuggers that do other stuff too?  I gotta read the adb 
manual some day.

-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: (reiserfs) Re: More on 2.2.18pre2aa2

2000-09-16 Thread yodaiken

On Sat, Sep 16, 2000 at 02:15:16PM +0200, Jamie Lokier wrote:
> [EMAIL PROTECTED] wrote:
> > > Sure the global system is slower.  But the "interactive feel" is faster.
> > 
> > Let's pop up little buttons to make it "feel" faster.
> 
> If little buttons pop up quickly when I click on them, then yes that's
> better interactive feel.  Sometimes the disk is involved in this.
> 
> Nobody has a problem with tuning the task scheduler to work this way.

If we played some "zippy" music that would add to the "feel". Of course,
we could actually use benchmarks instead.
And, to me, if kernel compiles take longer, I don't care how fast it "feels".

> 
> -- Jamie

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: (reiserfs) Re: More on 2.2.18pre2aa2

2000-09-15 Thread yodaiken

On Tue, Sep 12, 2000 at 04:30:32PM +0200, Jamie Lokier wrote:
> Sure the global system is slower.  But the "interactive feel" is faster.

Let's pop up little buttons to make it "feel" faster.


-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: Preallocated skb's?

2000-09-14 Thread yodaiken

On Thu, Sep 14, 2000 at 10:26:08PM -0400, jamal wrote:
> 
> 
> One of the things we need to measure still is the latency. The scheme
> currently used with dynamically adjusting the mitigation parameters might
> not affect latency much -- simply because the adjustment is based on the
> load. We still have to prove this. The theory is:
> Under a lot of congestion, you delay longer because the layers above
> you are congested as gauged from a feedback; and under low congestion, you
> should theoretically adjust all the way down to 1 interrupt/packet. Under
> heavy load, your latency is already screwed anyways because of large
> backlog queue; this is regardless of mitigation.

Or maybe the extra delay in congested circumstances will cause more 
timeouts and that's precisely when you need to improve latency?
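
To make the trade-off concrete, here is a minimal sketch of one feedback
step of a load-based mitigation scheme. It is not the scheme jamal is
measuring: the function name, the MIT_MIN/MIT_MAX bounds and the low/high
watermarks are all assumptions chosen for illustration.

```c
#include <assert.h>

/* Hypothetical bounds on how many packets are batched per interrupt. */
enum { MIT_MIN = 1, MIT_MAX = 64 };

/* One feedback step. `cur` is the current number of packets per
 * interrupt, `backlog` is the queue depth reported by the layers above,
 * and `low`/`high` are hypothetical watermarks. Under congestion the
 * batch doubles, under light load it halves, otherwise it is unchanged.
 * The result is always clamped to [MIT_MIN, MIT_MAX]. */
int next_mitigation(int cur, int backlog, int low, int high)
{
    assert(cur >= MIT_MIN && cur <= MIT_MAX);
    assert(low <= high);

    if (backlog > high) {
        cur *= 2;
        if (cur > MIT_MAX)
            cur = MIT_MAX;
    } else if (backlog < low) {
        cur /= 2;
        if (cur < MIT_MIN)
            cur = MIT_MIN;
    }
    return cur;
}
```

The point of the objection is visible in the first branch: every doubling
makes the first packet of a batch wait longer for its interrupt, and that
happens exactly when the backlog is already high.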


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: [ANNOUNCE] Withdrawl of Open Source NDS Project/NTFS/M2FS for Linux

2000-09-05 Thread yodaiken

On Wed, Sep 06, 2000 at 12:31:55PM +1200, Chris Wedgwood wrote:
> Perhaps you would like to describe how you do debug the kernel? I ask

I find that rebooting the machine and cursing myself is one of the
most effective kernel debugging methods.


-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: thread group comments

2000-09-04 Thread yodaiken

On Sat, Sep 02, 2000 at 03:12:40AM +0200, Andi Kleen wrote:
> On Fri, Sep 01, 2000 at 06:11:05PM -0700, Ulrich Drepper wrote:
> > "Andi Kleen" <[EMAIL PROTECTED]> writes:
> > 
> > > Do you think the SA_NOCLDWAIT/queued exit signal approach makes sense ? 
> > 
> > I'm not sure whether it's worth the effort.  But I'm saying this now
> > looking at the code for another implementation following the 1:1 model.
> 
> So you have a different way to implement pthread_create without context
> switch to the thread manager in 2.4 ? 

What is the problem with the often proposed CLONE_PARENT solution?


-- 
-----
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com




Re: SCO: "thread creation is about a thousand times faster than

2000-08-29 Thread yodaiken

On Tue, Aug 29, 2000 at 02:12:17PM +0200, Marc Lehmann wrote:
> On Mon, Aug 28, 2000 at 09:20:49AM -0600, [EMAIL PROTECTED] wrote:
> > You can't rely on signals timing anyway -- that is quite clear in the
> > spec and in the implementation.
> 
> there is no "spec" on how it should be done. Again, it is about security and
> "doing it as right as possible" and not "according to POSICKS we can do
> whatever we want".

Implementing signals with good worst-case timing seems very implausible
to me. We don't even have timing guarantees for signals in RTLinux, where
the kernel is simple. The main reason is SMP -- if the target thread/process
is running on a second CPU, you need to send an IPI and then have the remote
processor run signal code. But IPIs are relatively slow and the remote
processor may have interrupts disabled and ...

> 
> Just because some standard says we needn't does not mean that we could do
> it better.
> 
> > Especially on a SMP machine, STOP has weak semantics and I don't see how
> to improve it.
> 
> It is possible to get "good enough" semantics with the

"good enough" for security? 
On processor A a SIGSTOP is issued to a thread group while on processor
B a thread in that group has just entered write. How many bytes written is
"good enough"?

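From user space the same uncertainty shows up as short writes: a write
that is interrupted by a caught signal may return any count between
nothing and the full length. The following is a minimal sketch of the
usual retry loop, assuming POSIX write(). The name write_all() is made up
for this example, and it says nothing about what a stop signal should do
inside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `len` bytes have
 * been accepted. Returns the number of bytes written, which is less
 * than `len` only if write() returned 0, or -1 on a real error. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);
    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte moved: retry */
            return -1;      /* real error; bytes already written stay written */
        }
        if (n == 0)
            break;          /* no progress possible, report what we have */
        done += (size_t)n;  /* short write: carry on from where it stopped */
    }
    return (ssize_t)done;
}
```

The loop hides the partial count from the caller, but anyone else reading
the file or pipe in the meantime can still observe the intermediate state.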

I've really exceeded my quota for this news group and will respond
privately if you want to continue.

-- 
-
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com
