Re: [PATCH] OOM killer API (was: [PATCH] VM fix for 2.4.0-test9 OOM handler)

2000-10-11 Thread Bruce A. Locke

On Wed, 11 Oct 2000, Paul Jakma wrote:

> that's why you have per process limits set. Eg, PAM makes this
> exceedingly easy with pam_limits.so -> edit /etc/security/limits.conf.
> 
> this prevents at least 90% of OOM situations (ie individual leaky
> processes). eg netscape will then pop-up "can not allocate memory"
> messages and stop rendering pages instead of crashing your system.

I wasn't aware that PAM settings affected daemons started at boot time,
but I will check into it, thank you.

BTW, you said this prevents at least 90% of OOM situations.  What are the
remaining cases where it doesn't work?

> 
> --paulj
> 

------
Bruce A. Locke
[EMAIL PROTECTED]

"The Internet views censorship as damage and routes around it"
www.eff.org  www.peacefire.org

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] OOM killer API (was: [PATCH] VM fix for 2.4.0-test9 OOM handler)

2000-10-11 Thread Bruce A. Locke

On Thu, 12 Oct 2000, Matthew Hawkins wrote:

> Yep, for not setting appropriate resource limits.
> 
> man 2 setrlimit
> 
> Of course, if it's a kernel bug that causes it I think you're SOL ;)

This manpage shows me functions and structs.  I'm assuming you want these
used either by the offending program or by the shell from which the
program is started.  In the first case, a person might not have the source
to the program, and then it doesn't help much.  In the second case, if the
shell sets the limit, does it also apply to the children of that process
(i.e. those created with fork())?
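
For reference, the setrlimit(2) manpage does cover the second case: a
child created with fork() inherits its parent's resource limits, and the
limits are preserved across execve().  So a small wrapper can limit a
program without its source.  The following is only a minimal sketch in C,
not anything posted in this thread; the helper names set_soft_as_limit(),
child_inherits_limit() and run_limited() are made up for illustration, and
RLIMIT_AS (address space) is just one of the limits that could be used.

```c
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Lower (or raise, up to the hard limit) the soft address-space limit of
 * the calling process.  Returns 0 on success, -1 on error, for example
 * when `bytes` exceeds the hard limit.
 */
int set_soft_as_limit(rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;
    rl.rlim_cur = bytes;            /* the hard limit (rlim_max) is untouched */
    return setrlimit(RLIMIT_AS, &rl);
}

/*
 * Fork a child that reads its own RLIMIT_AS and reports through its exit
 * status whether the soft limit equals `expected`.
 * Returns 1 if it matches, 0 if it does not, -1 on fork/wait failure.
 */
int child_inherits_limit(rlim_t expected)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit rl;
        int ok = getrlimit(RLIMIT_AS, &rl) == 0 && rl.rlim_cur == expected;
        _exit(ok ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/*
 * Run argv[0] (looked up in PATH) with a soft address-space limit of
 * `bytes`.  The limit is set in the child only, so the caller itself is
 * not restricted.  Returns the program's exit status, or -1 if it did
 * not exit normally or fork/wait failed.
 */
int run_limited(rlim_t bytes, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (set_soft_as_limit(bytes) != 0)
            _exit(126);             /* could not apply the limit */
        execvp(argv[0], argv);
        _exit(127);                 /* exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_limited() from main() with the argv of the misbehaving
program is roughly what a shell does when "ulimit" is set before starting
it: allocations beyond the limit then fail inside that program instead of
exhausting the whole machine.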

Thanks for your time...

> 
> -- 
> Matt
> 

--
Bruce A. Locke
[EMAIL PROTECTED]

"The Internet views censorship as damage and routes around it"
www.eff.org  www.peacefire.org




Re: [PATCH] OOM killer API (was: [PATCH] VM fix for 2.4.0-test9 OOM handler)

2000-10-11 Thread Bruce A. Locke


You're making the deadly assumption that all applications behave
exactly the same way all the time.  Oops... netscape decided to freak out
and take up all your memory... guess it's the admin's fault.  Oops... some
mod_perl script decided to freak out and an apache process decides to suck
up all of your CPU and memory.

Crap like this does happen.  An example of this is a webboard package
called "Blackboard" consisting of various mod_perl scripts, apache, and
mysql. It is an educational online conferencing system being used in
conjunction with many college classes and thus is quite vital to the
campus.  

Unfortunately it's buggy as hell, and the memory-sucking bug didn't pop up
until we were a couple of weeks into classes and locked into the system.  A
mod_perl script freaks out, the copy of apache goes nuts, and we get a
bunch of lovely out-of-memory messages on the console.  It's times
like these that an OOM killer like Rik's would be very useful.  I feel
Rik's OOM killer backported to 2.2.x would do wonders for this situation.
After playing with Rik's OOM system, I know it would do the right thing on
this system, but unfortunately 2.4.x isn't trustworthy yet.

Yes, the software is buggy and should be fixed.  Do I have the power to
fix a broken commercial package that I'm locked into?  No.

The point of an OOM killer is that when all hell breaks loose, you have a
choice between a locked-up system, a system that's slow as hell because
it's spending all its time swapping, or a system that kills the offender
and gets back to business.  I choose the third option.  I can't think of
any situation (on either desktop or server) where a system lockup or panic
due to OOM would be acceptable with 2.4.x.


On Thu, 12 Oct 2000, Matthew Hawkins wrote:

> 
> Heh.. now all we need is some smart-arse to make something similar to
> apply to the _entire_ VM subsystem, and both Rik and Andrea can be happy
> ;)
> 
> Seriously, am I missing something obvious or is it far simpler just to
> keel over and die if the system goes OOM?  I mean, seriously, if the
> administrator lets it get to that state then he/she/it deserves a dead
> system.  It's akin to having your car run out of petrol - you don't
> start shooting passengers because their extra load made the engine chew
> more.  You pack up your kitty and go to the nearest petrol station and
> buy more, plug it into the car then learn from the experience so this
> fringe case of it happening doesn't happen again.  I don't really see
> much difference between a car going "OOP" and a computer going OOM.
> Should we start deleting files according to some randomly-chosen
> heuristic if a filesystem goes "OOS" ?
> 
> -- 
> Matt

--
Bruce A. Locke
[EMAIL PROTECTED]

"The Internet views censorship as damage and routes around it"
www.eff.org  www.peacefire.org




RE: PERCRAID 3 drivers?

2000-09-20 Thread Bruce A. Locke


Thanks to everyone who responded.

The aacraid driver patches that were in the Red Hat Pinstripe kernel SRPM
work fine with 2.2.17.  The machine seems pretty stable and speed is about
the same as with the binary driver.

Thanks again...

--
Bruce A. Locke
[EMAIL PROTECTED]

"The Internet views censorship as damage and routes around it"
www.eff.org  www.peacefire.org




Re: PERCRAID 3 drivers?

2000-09-18 Thread Bruce A. Locke


As a matter of fact I already have such a kernel compiled but I need to be
there in person to make sure it doesn't blow up. :)

There were four files:

linux-2.2.16-aacraid-1.0.3.patch
linux-2.2.16-aacraid-1.0.3-paths.patch
linux-2.2.16-aacraid-1.0.4.patch
linux-2.2.16-aacraid-1.0.5.patch

The only part of the patches that caused a reject with 2.2.17 was this:

--- linux/drivers/scsi/scsi.c.aacraid   Tue Jun 13 12:52:09 2000
+++ linux/drivers/scsi/scsi.c   Tue Jun 13 12:52:42 2000
@@ -307,6 +307,8 @@
 {"MATSHITA","PD-1","*", BLIST_FORCELUN | BLIST_SINGLELUN},
 {"iomega","jaz 1GB","J.86", BLIST_NOTQ | BLIST_NOLUN},
 {"TOSHIBA","CDROM","*", BLIST_ISROM},
+{"DELL", "PERCRAID", "*", BLIST_FORCELUN},
+{"HP", "NetRAID-4M", "*", BLIST_FORCELUN},
 {"MegaRAID", "LD", "*", BLIST_FORCELUN},
 /*
  * Must be at end of list...

It is easy to add by hand.
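
For readers who have not looked at this table: each entry names a vendor,
a model and a revision ("*" meaning any), followed by flags that adjust how
the SCSI layer scans that device, and a terminating entry has to stay at
the end of the list.  The following is only a rough sketch in C of how
such a table could be consulted.  It is not the kernel's code: every name
prefixed with ex_ or EX_ is hypothetical, the flag values are placeholders,
and the real lookup in drivers/scsi/scsi.c may well differ in its details.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder flag values; the kernel defines its own BLIST_* constants. */
#define EX_BLIST_FORCELUN 0x02u
#define EX_BLIST_ISROM    0x04u

struct ex_dev_info {
    const char *vendor;
    const char *model;
    const char *revision;   /* "*" matches any revision */
    unsigned flags;
};

static const struct ex_dev_info ex_device_list[] = {
    {"TOSHIBA", "CDROM", "*", EX_BLIST_ISROM},
    {"DELL", "PERCRAID", "*", EX_BLIST_FORCELUN},
    {"HP", "NetRAID-4M", "*", EX_BLIST_FORCELUN},
    {"MegaRAID", "LD", "*", EX_BLIST_FORCELUN},
    {NULL, NULL, NULL, 0}   /* must be at end of list */
};

/* A pattern matches if it is "*" or a prefix of the reported string. */
int ex_field_matches(const char *pattern, const char *value)
{
    if (strcmp(pattern, "*") == 0)
        return 1;
    return strncmp(pattern, value, strlen(pattern)) == 0;
}

/*
 * Return the flags of the first entry matching the strings a device
 * reports about itself, or 0 if the device is not in the list.
 */
unsigned ex_get_device_flags(const char *vendor, const char *model,
                             const char *revision)
{
    const struct ex_dev_info *d;

    for (d = ex_device_list; d->vendor != NULL; d++) {
        if (ex_field_matches(d->vendor, vendor) &&
            ex_field_matches(d->model, model) &&
            ex_field_matches(d->revision, revision))
            return d->flags;
    }
    return 0;
}
```

Because the walk stops at the terminating entry, new devices such as the
two added by the patch have to go above it, which is why the rejected hunk
can simply be inserted by hand anywhere before the end of the table.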

I am concerned, though, about the KNOWNBUGS file that was in the 1.0.3
patch but was removed by the later patches.  It seems to indicate that it's
a bad idea to compile the driver directly into the kernel.  Is it better to
compile it as a module?

Thanks for your time...



On Mon, 18 Sep 2000, Jon Mitchell wrote:

> On Sun, Sep 17, 2000 at 09:40:18AM -0500, [EMAIL PROTECTED] wrote:
> > The aacraid driver was submitted to Alan Cox, but rejected because it has
> > too many "NTism's" in it, which are being addressed.  Please see the Red Hat
> > Linux "Pinstripe" beta kernel source RPM for the source code, or contact me
> > privately.
> 
> Or, you can get this yourself.  Evidently the source code is released.  
> 
> By going to Dell's website and downloading the kernel source rpm for
> 2.2.16-3, then installing the kernel rpm with rpm -i.  Finally go into the
> /usr/src/redhat/SOURCES directory and you will find two files with aacraid
> in the name.  These patches will apply (patch is able to make do with the
> slight line number changes) with only a couple of exceptions.  You will
> find the rejects extremely easy to fix because 3 out of 4 of them are 
> already in the kernel, only one thing actually needs to be fixed by hand.
> 
> Then make config and say yes to adaptec raid controller question.  Just
> had to do this last week, and if necessary I can make a patch and send it
> out that applies correctly to the 2.2.18pre series.
> 
> --
> Jon Mitchell
> Systems Engineer, Subject Wills & Company
> [EMAIL PROTECTED]

--
Bruce A. Locke
[EMAIL PROTECTED]

"The Internet views censorship as damage and routes around it"
www.eff.org  www.peacefire.org




Re: PERCRAID 3 drivers?

2000-09-17 Thread Bruce A. Locke


So does this mean that the driver currently in Red Hat's Pinstripe beta
should be avoided on a production SMP system?  Is sticking with 2.2.14
preferable right now?  Does anyone know how far along the Adaptec guys are?

Thanks for your time... 


On Sun, 17 Sep 2000, Alan Cox wrote:

> > AFAIK, Dell wrote these drivers themselves and they are unwilling to release
> 
> The drivers for the percraid have adaptec copyrights and have been made
> available finally but were too ugly at the moment to merge (and had some 
> obvious potentially nasty bugs like using down() on a spinlock)
> 
> The adaptec guys are cleaning it up
> 

--
Bruce A. Locke
[EMAIL PROTECTED]

"The Internet views censorship as damage and routes around it"
www.eff.org  www.peacefire.org




PERCRAID 3 drivers?

2000-09-17 Thread Bruce A. Locke


Hello...

The organization I do some work for purchased a rackmount server from 
Dell with the intent of running some webconferencing software under
Linux.  The salesman we had spoken to assured us that Linux fully
supported the machine.   Yeah... Right...  :)

Now it seems I'm stuck with a PERCRAID 3 card that only has support in 
the form of a binary kernel module for kernel 2.2.14 (w/ redhat's
patches).  While the box runs fine with this kernel, I would definitely
like to upgrade the kernel to something that doesn't have so many known
flaws ;)  The machine is already in use so switching raid cards isn't much
of an option at this time.

A check of Dell's (rather horrible) support website only turns up the
binary module mentioned above.  Does anyone know anything about these
PERCRAID 3 cards and whether there is an open source driver, or at least a
binary module for a newer kernel?

Thanks for your time...

--
Bruce A. Locke
[EMAIL PROTECTED]

"The Internet views censorship as damage and routes around it"
www.eff.org  www.peacefire.org




Will Riel's vmpatch make it into 2.4.0-test9?

2000-09-15 Thread Bruce A. Locke


Hello...  Just a question from a user... :)

I was just wondering if Rik van Riel's VM patches might possibly be
integrated into 2.4.0-testX anytime soon?

I have been having very good experiences with Riel's latest patches in
both a desktop and light server environment.  On the desktop, Riel's
vmpatch against 2.4.0-test8 performs significantly better than stock
2.4.0-test8.  Operation seems much faster and smoother.  In fact, I'd say
it could even be much better than what's floating around in 2.2.x now!

Just a couple of examples of where the differences in Riel's patches
become apparent (though one might argue that they are useless
indicators):

I have recently been having difficulty with Loki Games's latest patches
for the game Unreal Tournament.  The problem is that it leaks memory very
quickly.  After about two minutes of gameplay the process takes up 300+ MB
of memory, growing by around 10 MB per minute.  This machine only
has 256 MB of RAM, so you can imagine that it starts to swap at that
point.  Usually under 2.2.x it takes about 4 minutes of gameplay before
I hear my hard drive screech and the game starts becoming
unresponsive.  With Riel's patch and 2.4.0-test8, I can sometimes play for
up to 6 minutes before the swapping makes the game unusable.  Is Riel's
patch more efficient at swapping or something?

Another example...  Start Quake3, join a game on the internet and play for
about 5 minutes.  Then quit and go right back into the game.  With Riel's
patch the hard drive is only touched for a brief second.  With 2.2.x the
hard drive is touched for a few seconds while quake3 loads the maps, etc.
that were played before.  Riel's patch seems to keep things cached in
memory much more efficiently.

And yet another example... mkisofs on my system is a dog with
2.4.0-test8.  It kills performance of other applications and makes mp3
players skip etc.  There is no skipping and there are no "freezeups" with
Riel's patch.  Although mkisofs seems slower under 2.4.0-test8+vmpatch than
under stock 2.2.x, Riel's VM seems to be much "smoother".

I also tried Riel's patch on a lightly loaded webserver, but because of the
lack of any substantial load on the system I don't think it's worth
commenting on...  Perhaps other people could comment on their
experiences?

I am looking forward to vmpatch with the planned IO tweaks and out of
memory handler...  I would hate to have to patch the VM system throughout
the 2.4.x stable tree with his patch so I'm lobbying for it to be
included. :)

Just my opinions and experiences... please feel free to flame away if I am
misunderstanding something... Thanks for your time. 

------
Bruce A. Locke
[EMAIL PROTECTED]

"The Internet views censorship as damage and routes around it"
www.eff.org  www.peacefire.org



