Re: 2.6.23.8: OOM killer kills wrong jobs
> You will *probably* get stable 16GB with the vendor tuned enterprise
> kernels (RHEL, CentOS etc),

That sounds like "a little" relief. Thesis 1,2,3 has 16GB memory. Aries has 12G.

Tony Wang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23.8: OOM killer kills wrong jobs
(removing Alan Cox from the Cc: list; he does not need to be involved
in the details of our discussions of local systems ...)

On Mon, 17 Dec 2007 [EMAIL PROTECTED] wrote:

> ... Thesis 1,2,3 has 16GB memory. Aries has 12G.

Note that the Theses and Aries are Xeon systems, which are 32-bit
systems in the first place, so the problem described is likely not
applicable to these. I fully expect that the problem being encountered
on Baby Alcor (and Bonnie?) is specifically because of the large memory
configuration, on a 64-bit system, but running a 32-bit OS
configuration.

At least that's how I understood the explanation.

----------------------------------------------------------------------
Sylvain Robitaille                                    [EMAIL PROTECTED]
Systems and Network analyst                        Concordia University
Instructional & Information Technology        Montreal, Quebec, Canada
----------------------------------------------------------------------
Re: 2.6.23.8: OOM killer kills wrong jobs
On Mon, 17 Dec 2007 10:44:05 -0500 (EST)
[EMAIL PROTECTED] wrote:

> > You will *probably* get stable 16GB with the vendor tuned enterprise
> > kernels (RHEL, CentOS etc),
>
> That sounds like "a little" relief. Thesis 1,2,3 has 16GB memory. Aries has 12G.

If you can run a 64bit kernel, it will save you an inordinate amount of
pain on a box with > 4GB of RAM. The fact that > 4GB is possible on such
a box in 32bit doesn't make it a good idea.

Alan
Re: 2.6.23.8: OOM killer kills wrong jobs
> ...but I've run into a situation in which a system on which I *have* set
> no overcommit is being blasted by the OOM killer anyway.

Looks like the kernel is eating all the resources needed.

> Linux babyalcor 2.6.23.1 #1 SMP Fri Oct 26 15:35:18 EDT 2007 \
>    i686 Dual Core AMD Opteron(tm) Processor 280 AuthenticAMD GNU/Linux

32bit kernel, 16GB of RAM. No surprise I'm afraid. Handling 16GB on a
32bit kernel, which has to manage it all through a small addressable
memory window, is right on the limit of what the standard kernel will
handle (8GB is probably as high as I would go).

The no overcommit code ensures that user space doesn't overcommit, but
the kernel can get itself short of low memory resources on a big box
with 32bit kernels very easily. (In 64bit mode the CPU can address all
the memory directly so the problem vanishes.)

You will *probably* get stable 16GB with the vendor tuned enterprise
kernels (RHEL, CentOS etc), or run a 64bit kernel and then the kernel
isn't trying the software equivalent of managing a filing cabinet
through the keyhole.

Alan
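For anyone wanting to see how much of that low memory window is left on a box like this, here is a minimal sketch in Python. It assumes a 32-bit kernel built with highmem support, which reports LowTotal and LowFree in /proc/meminfo; a 64-bit kernel has no such split and the fields are absent. The function names are made up for illustration and are not part of any kernel or library interface.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: kilobytes} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   12345 kB"-style lines with a numeric value.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def lowmem_summary(fields):
    """Return (low_total_kb, low_free_kb, percent_free), or None.

    None means the kernel did not report LowTotal/LowFree, as on a
    64-bit kernel or a 32-bit kernel built without highmem support.
    """
    if "LowTotal" not in fields or "LowFree" not in fields:
        return None
    total, free = fields["LowTotal"], fields["LowFree"]
    percent = 100.0 * free / total if total else 0.0
    return total, free, percent


def read_lowmem(path="/proc/meminfo"):
    """Convenience wrapper: read the live file and summarise it."""
    with open(path) as f:
        return lowmem_summary(parse_meminfo(f.read()))
```

Calling read_lowmem() on the affected machine and watching the free figure while the jobs run should show whether low memory, rather than total memory, is what runs out before the OOM killer fires.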
Re: 2.6.23.8: OOM killer kills wrong jobs
On Fri, 07 Dec 2007 10:25:23 +0100
Martin MOKREJŠ <[EMAIL PROTECTED]> wrote:

> Hi,
>   first of all, sorry for not being up to date with how the OOM killer
> works. I think there used to be a kernel config option to disable
> OOM killer and instead kill the process which actually asks for the
> memory and supposedly caused the memory lack. That is what I would
> like to have on my system. I have a 1GB RAM laptop and use t-coffee
> software from
> http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html
> to do some science. ;)

The OOM killer triggers when there is no way to fulfill a page request.
Something has to go and there is no real notion of "right" or "wrong"
process at that point.

You can either set no overcommit, in which case you'll get failed malloc
and similar rather than allow overcommit, or you can set the OOM
priority of tasks yourself so that your specific app of choice always
dies first.

Alan
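The two options above can be sketched roughly as follows, in Python. This is a sketch under assumptions, not a definitive recipe: it assumes the usual procfs locations, that vm.overcommit_memory accepts 0 (heuristic), 1 (always overcommit) and 2 (strict, no overcommit), and that the per-process knob is /proc/<pid>/oom_adj (range -17 to 15) on kernels of the 2.6.23 era or /proc/<pid>/oom_score_adj (range -1000 to 1000) on later ones. The function names are hypothetical, and the path arguments exist only so the code can be tried against a scratch directory.

```python
import os


def set_overcommit_mode(mode, path="/proc/sys/vm/overcommit_memory"):
    """Write the overcommit policy: 0 heuristic, 1 always, 2 strict.

    Mode 2 is the "no overcommit" setting. Writing the real file
    normally requires root.
    """
    if mode not in (0, 1, 2):
        raise ValueError("overcommit mode must be 0, 1 or 2")
    with open(path, "w") as f:
        f.write(f"{mode}\n")


def prefer_as_oom_victim(pid, proc_root="/proc"):
    """Mark a process as the first choice for the OOM killer.

    Tries the newer oom_score_adj file first, then the older oom_adj
    file, and writes the maximum value of whichever exists. Returns
    the name of the file that was written.
    """
    for name, value in (("oom_score_adj", 1000), ("oom_adj", 15)):
        path = os.path.join(proc_root, str(pid), name)
        if os.path.exists(path):
            with open(path, "w") as f:
                f.write(f"{value}\n")
            return name
    raise FileNotFoundError(f"no OOM adjustment file for pid {pid}")
```

With strict overcommit the memory-hungry job sees a failed allocation instead of some other process being killed; with the second function, something like prefer_as_oom_victim(os.getpid()) run by the job itself at startup makes it the one that goes first.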
re: 2.6.23.8: OOM killer kills wrong jobs
Martin Mokrejš wrote:

> first of all, sorry for not being up to date with how the OOM killer
> works. I think there used to be a kernel config option to disable
> OOM killer and instead kill the process which actually asks for the
> memory and supposedly caused the memory lack. That is what I would
> like to have on my system. I have a 1GB RAM laptop

You probably just need to add more swap space on your system.

Any time the OOM killer fires, something's wrong with the system, and
it's more productive to deal with that than to wish for a more accurate
OOM killer; see http://lwn.net/Articles/111408/

When I was working at a company that used embedded Linux, I eventually
figured this out, and patched the kernel to panic on OOM conditions;
that gave users the right incentive to avoid configuring jobs that
caused the system to run out of memory.

- Dan
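Before adding swap it helps to know how much is configured and how much is in use. A minimal sketch in Python, assuming the usual /proc/swaps layout (one header line, then one line per swap area, ending in type, size in kB, used in kB and priority); the function names are made up for illustration. As an aside on the panic approach, mainline kernels have had a vm.panic_on_oom sysctl since around 2.6.18, so patching should not be needed on a kernel of this vintage, though that is worth checking against the running kernel's documentation.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts (sizes in kB)."""
    areas = []
    for line in text.splitlines()[1:]:  # first line is the column header
        cols = line.split()
        if len(cols) < 5:
            continue
        # The last four columns are fixed; anything before them is the name.
        kind, size, used, priority = cols[-4:]
        areas.append({
            "name": " ".join(cols[:-4]),
            "type": kind,
            "size_kb": int(size),
            "used_kb": int(used),
            "priority": int(priority),
        })
    return areas


def swap_totals(areas):
    """Return (total_size_kb, total_used_kb) across all swap areas."""
    size = sum(a["size_kb"] for a in areas)
    used = sum(a["used_kb"] for a in areas)
    return size, used


def read_swap_totals(path="/proc/swaps"):
    """Convenience wrapper: read the live file and total it up."""
    with open(path) as f:
        return swap_totals(parse_swaps(f.read()))
```

If read_swap_totals() shows swap close to fully used when the OOM killer fires on the 1GB laptop, more swap is the cheap first thing to try.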