Re: [PATCH] OOM killer API (was: [PATCH] VM fix for 2.4.0-test9 OOM handler)

On Wed, 11 Oct 2000, Paul Jakma wrote:

> that's why you have per process limits set. Eg, PAM makes this
> exceedingly easy with pam_limits.so -> edit /etc/security/limits.conf.
>
> this prevents at least 90% of OOM situations (ie individual leaky
> processes). eg netscape will then pop-up "can not allocate memory"
> messages and stop rendering pages instead of crashing your system.

I wasn't aware PAM settings affected daemons started up during boot time, but I will check into it, thank you. BTW, you said it prevents only 90% of OOM situations; what are the other 10% where it doesn't work?

> --paulj

--
Bruce A. Locke [EMAIL PROTECTED]
"The Internet views censorship as damage and routes around it"
www.eff.org www.peacefire.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
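The behaviour Paul describes (a capped process gets an allocation failure instead of dragging the whole box down) can be sketched roughly as follows. This is only an illustration, assuming Linux and Python 3, whose `resource` module wraps setrlimit(2). The 1 GiB cap, the 2 GiB request, and the helper names are made up for the demo; PAM itself is not involved here.

```python
import resource
import subprocess
import sys

CAP_BYTES = 1 << 30       # hypothetical 1 GiB address-space cap
REQUEST_BYTES = 2 << 30   # stand-in for a leak: one 2 GiB allocation

# Code run by the capped child process.
CHILD_CODE = f"""
try:
    data = bytearray({REQUEST_BYTES})
    print("allocated")
except MemoryError:
    print("can not allocate memory")
"""


def cap_address_space():
    # Runs in the child after fork() and before exec().
    # Only the soft limit is lowered, so no privileges are needed.
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    cap = CAP_BYTES if hard == resource.RLIM_INFINITY else min(CAP_BYTES, hard)
    resource.setrlimit(resource.RLIMIT_AS, (cap, hard))


def run_capped():
    # Start a fresh interpreter under the cap and return what it printed.
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        preexec_fn=cap_address_space,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_capped())
```

The parent is never capped; only the child sees the limit, which is the point of a per-process limit.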
Re: [PATCH] OOM killer API (was: [PATCH] VM fix for 2.4.0-test9 OOM handler)

On Thu, 12 Oct 2000, Matthew Hawkins wrote:

> Yep, for not setting appropriate resource limits.
>
> man 2 setrlimit
>
> Of course, if it's a kernel bug that causes it I think you're SOL ;)

This manpage shows me functions and structs. I'm assuming you want these used either by the offending program or by the shell under which the program is called. In the first case, a person might not have the source to the program, and if that's the case it doesn't help much. In the second case, if the shell sets the limit, does it also apply to children of the process (i.e. fork()'d ones)?

Thanks for your time...

> --
> Matt

--
Bruce A. Locke [EMAIL PROTECTED]
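On the inheritance question: setrlimit(2) documents that limits are inherited across fork() and preserved across execve(), so a limit set by the launching shell applies to everything started under it. Here is a small sketch of that, assuming Linux and Python 3. It uses the open-files limit only because it is harmless to lower; the value 200 and the helper names are hypothetical.

```python
import resource
import subprocess
import sys

NOFILE_SOFT = 200  # hypothetical soft limit on open files

# Grandchild: just report the soft limit it sees.
GRANDCHILD = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"

# Child: plays the role of the shell and launches the grandchild unchanged.
CHILD = (
    "import subprocess, sys; "
    "subprocess.run([sys.executable, '-c', sys.argv[1]], check=True)"
)


def target_soft_limit():
    # Never ask for more than the current hard limit allows.
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY:
        return NOFILE_SOFT
    return min(NOFILE_SOFT, hard)


def lower_nofile():
    # Runs in the child after fork() and before exec().
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target_soft_limit(), hard))


def inherited_soft_limit():
    # Limit is set once on the child; the grandchild reports what it inherited.
    out = subprocess.run(
        [sys.executable, "-c", CHILD, GRANDCHILD],
        preexec_fn=lower_nofile,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(out.stdout.strip())


if __name__ == "__main__":
    print(inherited_soft_limit())
```

The grandchild never calls setrlimit itself, so no source access to the limited program is needed; the limit only has to be set somewhere above it in the process tree.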
Re: [PATCH] OOM killer API (was: [PATCH] VM fix for 2.4.0-test9 OOM handler)

You're making the deadly assumption that all applications behave themselves exactly the same all the time. Oops... netscape decided to freak out and take up all your memory... guess it's the admin's fault. Oops... some mod_perl script decided to freak out and an apache process decides to suck up all of your CPU and memory. Crap like this does happen.

An example of this is a webboard package called "Blackboard", consisting of various mod_perl scripts, apache, and mysql. It is an educational online conferencing system used in conjunction with many college classes and thus is quite vital to the campus. Unfortunately it's buggy as hell, and the memory-sucking bug didn't pop up until we were a couple of weeks into classes and locked into the system. A mod_perl script freaks out, the copy of apache goes nuts, and we get a bunch of lovely out-of-memory messages on the console.

It's times like these that an OOM killer like Rik's would be very useful. I feel Rik's OOM killer backported to 2.2.x would do wonders for this situation. After playing with Rik's OOM system, I know it would do the right thing on this system, but unfortunately 2.4.x isn't trustworthy yet.

Yes, the software is buggy and should be fixed. Do I have the power to fix a broken commercial package that I'm locked into? No.

The point of an OOM killer is that if all hell breaks loose, you have a choice between a locked-up system, a system that's slow as hell because it's spending all its time swapping, or a system that kills the offender and gets back to business. I choose the third option. I can't think of any situation (either on desktop or server) where a system lockup or panic due to OOM would be acceptable w/ 2.4.x.

On Thu, 12 Oct 2000, Matthew Hawkins wrote:

> Heh.. now all we need is some smart-arse to make something similar to
> apply to the _entire_ VM subsystem, and both Rik and Andrea can be happy
> ;)
>
> Seriously, am I missing something obvious or is it far simpler just to
> keel over and die if the system goes OOM? I mean, seriously, if the
> administrator lets it get to that state then he/she/it deserves a dead
> system. It's akin to having your car run out of petrol - you don't
> start shooting passengers because their extra load made the engine chew
> more. You pack up your kitty and go to the nearest petrol station and
> buy more, plug it into the car then learn from the experience so this
> fringe case of it happening doesn't happen again. I don't really see
> much difference between a car going "OOP" and a computer going OOM.
> Should we start deleting files according to some randomly-chosen
> heuristic if a filesystem goes "OOS" ?
>
> --
> Matt

--
Bruce A. Locke [EMAIL PROTECTED]
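To make "kills the offender" concrete, here is a deliberately naive sketch of victim selection. It is not Rik's heuristic, which is not described in this thread; it simply picks the largest non-system process. The process table, the `is_system` flag, and all numbers are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Proc:
    pid: int
    name: str
    rss_mb: int              # resident memory, in MB
    is_system: bool = False  # hypothetical "never kill" flag


def pick_victim(procs: List[Proc]) -> Optional[Proc]:
    # Skip processes marked as essential, then take the biggest consumer.
    candidates = [p for p in procs if not p.is_system]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.rss_mb)


# Invented snapshot of a box where one apache child has run away.
SNAPSHOT = [
    Proc(1, "init", 1, is_system=True),
    Proc(412, "mysqld", 80),
    Proc(733, "httpd", 410),
    Proc(734, "httpd", 12),
]

if __name__ == "__main__":
    victim = pick_victim(SNAPSHOT)
    print(victim.pid, victim.name)
```

A real policy would need to weigh more than size (a large database is not necessarily the culprit), which is presumably why the choice of heuristic is being argued over at all.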
RE: PERCRAID 3 drivers?
Thanks to everyone who responded. The aacraid driver patches that were in the Red Hat "Pinstripe" kernel SRPM work fine with 2.2.17. The machine seems pretty stable, and speed is about the same as with the binary driver.

Thanks again...

--
Bruce A. Locke [EMAIL PROTECTED]
Re: PERCRAID 3 drivers?
As a matter of fact I already have such a kernel compiled, but I need to be there in person to make sure it doesn't blow up. :)

There were four files:

linux-2.2.16-aacraid-1.0.3.patch
linux-2.2.16-aacraid-1.0.3-paths.patch
linux-2.2.16-aacraid-1.0.4.patch
linux-2.2.16-aacraid-1.0.5.patch

The only part of the patches that caused a reject with 2.2.17 was this:

--- linux/drivers/scsi/scsi.c.aacraid Tue Jun 13 12:52:09 2000
+++ linux/drivers/scsi/scsi.c Tue Jun 13 12:52:42 2000
@@ -307,6 +307,8 @@
 {"MATSHITA","PD-1","*", BLIST_FORCELUN | BLIST_SINGLELUN},
 {"iomega","jaz 1GB","J.86", BLIST_NOTQ | BLIST_NOLUN},
 {"TOSHIBA","CDROM","*", BLIST_ISROM},
+{"DELL", "PERCRAID", "*", BLIST_FORCELUN},
+{"HP", "NetRAID-4M", "*", BLIST_FORCELUN},
 {"MegaRAID", "LD", "*", BLIST_FORCELUN},
 /*
  * Must be at end of list...

Easily addable by hand. I am concerned, though, about the KNOWNBUGS file that was in the 1.0.3 patch but was removed by the later patch versions. It seems to indicate it's a bad idea to compile the driver directly into the kernel. Is it better to compile it as a module?

Thanks for your time...

On Mon, 18 Sep 2000, Jon Mitchell wrote:

> On Sun, Sep 17, 2000 at 09:40:18AM -0500, [EMAIL PROTECTED] wrote:
> > The aacraid driver was submitted to Alan Cox, but rejected because it has
> > too many "NTism's" in it, which are being addressed. Please see the Red Hat
> > Linux "Pinstripe" beta kernel source RPM for the source code, or contact me
> > privately.
>
> Or, you can get this yourself. Evidently the source code is released.
>
> By going to Dell's website and downloading the kernel source rpm for
> 2.2.16-3, then installing the kernel rpm with rpm -i. Finally go into the
> /usr/src/redhat/SOURCES directory and you will find two files with aacraid
> in the name. These patches will apply (patch is able to make do with the
> slight line number changes) with only a couple of exceptions. You will
> find the rejects extremely easy to fix because 3 out of 4 of them are
> already in the kernel, only one thing actually needs to be fixed by hand.
>
> Then make config and say yes to adaptec raid controller question. Just
> had to do this last week, and if necessary I can make a patch and send it
> out that applies correctly to the 2.2.18pre series.
>
> --
> Jon Mitchell
> Systems Engineer, Subject Wills & Company
> [EMAIL PROTECTED]

--
Bruce A. Locke [EMAIL PROTECTED]
Re: PERCRAID 3 drivers?
So does this mean that the driver currently in Red Hat's Pinstripe beta should be avoided on a production SMP system? Is sticking with 2.2.14 preferable right now? Anyone know how far along the adaptec guys are?

Thanks for your time...

On Sun, 17 Sep 2000, Alan Cox wrote:

> > AFAIK, Dell wrote these drivers themselves and they are unwilling to release
>
> The drivers for the percraid have adaptec copyrights and have been made
> available finally but were too ugly at the moment to merge (and had some
> obvious potentially nasty bugs like using down() on a spinlock)
>
> The adaptec guys are cleaning it up

--
Bruce A. Locke [EMAIL PROTECTED]
PERCRAID 3 drivers?
Hello...

The organization I do some work for purchased a rackmount server from Dell with the intent of running some webconferencing software under Linux. The salesman we had spoken to assured us that Linux fully supported the machine. Yeah... Right... :)

Now it seems I'm stuck with a PERCRAID 3 card that is only supported by a binary kernel module for kernel 2.2.14 (w/ Red Hat's patches). While the box runs fine with this kernel, I would definitely like to upgrade to a kernel that doesn't have so many known flaws ;) The machine is already in use, so switching RAID cards isn't much of an option at this time.

A check of Dell's (rather horrible) support website only turns up the binary module mentioned above. Does anyone know anything about these PERCRAID 3 cards and whether there is an open source driver? Or at least a binary module for a newer kernel?

Thanks for your time...

--
Bruce A. Locke [EMAIL PROTECTED]
Will Riel's vmpatch make it into 2.4.0-test9?
Hello...

Just a question from a user... :)

I was just wondering if Rik van Riel's VM patches might possibly be integrated into 2.4.0-testX anytime soon? I have been having very good experiences with Riel's latest patches in both a desktop and a light server environment.

On the desktop, Riel's vmpatch against 2.4.0-test8 performs significantly better than stock 2.4.0-test8. Operation seems much faster and smoother. In fact, I'd say it could even be much better than what's floating around in 2.2.x now!

Just a couple of examples of where the differences in Riel's patches become apparent (though one might argue that they are useless indicators):

I have recently been having difficulty with Loki Games's latest patches for the game Unreal Tournament. The problem is that it leaks memory very quickly. After about two minutes of gameplay the process takes up 300+MB of memory, growing by around 10MB per minute. This machine only has 256MB of RAM, so you can imagine that it starts to swap at that point. Usually under 2.2.x it takes about 4 minutes of gameplay before I hear my hard drive screech and the game starts becoming unresponsive. With Riel's patch and 2.4.0-test8, I can sometimes play for up to 6 minutes before the swapping makes the game unusable. Is Riel's patch more efficient at swapping or something?

Another example... Start Quake3, join a game on the internet and play for about 5 minutes. Then quit and go right back into the game. With Riel's patch the hard drive is only touched for a brief second. With 2.2.x the hard drive is touched for a few seconds while quake3 loads the maps, etc. that were played before. Riel's patch seems to keep things cached in memory much more efficiently.

And yet another example... mkisofs on my system is a dog with 2.4.0-test8. It kills the performance of other applications and makes mp3 players skip, etc. There is no skipping and there are no "freezeups" with Riel's patch. Although mkisofs seems slower under 2.4.0-test8+vmpatch than under stock 2.2.x, Riel's VM seems to be much "smoother".

I tried Riel's patch on a lightly loaded webserver, but because of the lack of any substantial load on the system I don't think it's worth commenting on... Perhaps other people could comment on their experiences?

I am looking forward to vmpatch with the planned IO tweaks and out-of-memory handler... I would hate to have to patch the VM system throughout the 2.4.x stable tree with his patch, so I'm lobbying for it to be included. :)

Just my opinions and experiences... please feel free to flame away if I am misunderstanding something...

Thanks for your time.

--
Bruce A. Locke [EMAIL PROTECTED]