Re: close-on-exec alternatives (was: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?)
On Tue, Jun 02, 2009 at 08:58:50PM +0200, Bernd Walter wrote:
> It is still very interesting, because I currently have a similar problem
> and wasn't aware of getdtablesize().

Note that many (other) systems provide a much simpler and more efficient
function for the above, closefrom(3).

Joerg

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
close-on-exec alternatives (was: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?)
On Fri, May 29, 2009 at 12:31:37PM -0700, Alfred Perlstein wrote:
> * Dag-Erling Smørgrav [090529 02:49] wrote:
> > Alfred Perlstein writes:
> > > Dag-Erling Smørgrav writes:
> > > > Usually, what you see is closer to this:
> > > >
> > > >     if ((pid = fork()) == 0) {
> > > >         for (int fd = 3; fd < getdtablesize(); ++fd)
> > > >             (void)close(fd);
> > > >         execve(path, argv, envp);
> > > >         _exit(1);
> > > >     }
> > >
> > > I'm probably missing something, but couldn't you iterate
> > > in the parent setting the close-on-exec flag then vfork?
> >
> > This is an example, Alfred. Like most examples, it is greatly
> > simplified. I invite you to peruse the source to find real-world
> > instances of non-trivial fork() / execve() usage.
>
> It wasn't meant to criticize, just to ask a question about the specific
> instance because it made me curious. I know how bad it can be with
> vfork, as I observed a few fixes involving mistaken use of vfork at
> another job.

It is still very interesting, because I currently have a similar problem
and wasn't aware of getdtablesize(): a threaded application which needs
to call an external script of unknown runtime. I don't have all
descriptors under my control, because external libs might open them. I
also believe there could be a race between retrieving the descriptor and
setting close-on-exec. The only solution which I have found so far is
using rfork() with RFCFDG. If I understand RFCFDG correctly, the child
process has no descriptors at all. This is OK for me, since I don't need
to inherit any.

--
B.Walter  http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and more.
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
* Dag-Erling Smørgrav [090529 02:49] wrote:
> Alfred Perlstein writes:
> > Dag-Erling Smørgrav writes:
> > > Usually, what you see is closer to this:
> > >
> > >     if ((pid = fork()) == 0) {
> > >         for (int fd = 3; fd < getdtablesize(); ++fd)
> > >             (void)close(fd);
> > >         execve(path, argv, envp);
> > >         _exit(1);
> > >     }
> >
> > I'm probably missing something, but couldn't you iterate
> > in the parent setting the close-on-exec flag then vfork?
>
> This is an example, Alfred. Like most examples, it is greatly
> simplified. I invite you to peruse the source to find real-world
> instances of non-trivial fork() / execve() usage.

It wasn't meant to criticize, just to ask a question about the specific
instance because it made me curious. I know how bad it can be with
vfork, as I observed a few fixes involving mistaken use of vfork at
another job.

So yes, there's more than one way to skin a cat for this particular
example... but in practice using vfork()+exec() is hard to get right?

--
- Alfred Perlstein
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
Alfred Perlstein writes:
> Dag-Erling Smørgrav writes:
> > Usually, what you see is closer to this:
> >
> >     if ((pid = fork()) == 0) {
> >         for (int fd = 3; fd < getdtablesize(); ++fd)
> >             (void)close(fd);
> >         execve(path, argv, envp);
> >         _exit(1);
> >     }
>
> I'm probably missing something, but couldn't you iterate
> in the parent setting the close-on-exec flag then vfork?

This is an example, Alfred. Like most examples, it is greatly
simplified. I invite you to peruse the source to find real-world
instances of non-trivial fork() / execve() usage.

DES
--
Dag-Erling Smørgrav - d...@des.no
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
* Dag-Erling Smørgrav [090527 06:10] wrote:
> Yuri writes:
> > I don't have strong opinion for or against "memory overcommit". But I
> > can imagine one could argue that fork with intent of exec is a faulty
> > scenario that is a relict from the past. It can be replaced by some
> > atomic method that would spawn the child without overcommitting.
>
> You will very rarely see something like this:
>
>     if ((pid = fork()) == 0) {
>         execve(path, argv, envp);
>         _exit(1);
>     }
>
> Usually, what you see is closer to this:
>
>     if ((pid = fork()) == 0) {
>         for (int fd = 3; fd < getdtablesize(); ++fd)
>             (void)close(fd);
>         execve(path, argv, envp);
>         _exit(1);
>     }

I'm probably missing something, but couldn't you iterate in the parent
setting the close-on-exec flag then vfork? I guess that wouldn't work
for threads AND you'd have to undo it after the fork if you didn't want
to retain that behavior?

thanks,
-Alfred
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
Yuri writes:
> I don't have strong opinion for or against "memory overcommit". But I
> can imagine one could argue that fork with intent of exec is a faulty
> scenario that is a relict from the past. It can be replaced by some
> atomic method that would spawn the child without overcommitting.

You will very rarely see something like this:

    if ((pid = fork()) == 0) {
        execve(path, argv, envp);
        _exit(1);
    }

Usually, what you see is closer to this:

    if ((pid = fork()) == 0) {
        for (int fd = 3; fd < getdtablesize(); ++fd)
            (void)close(fd);
        execve(path, argv, envp);
        _exit(1);
    }

...with infinite variation depending on whether the parent needs to
communicate with the child, whether the child needs std{in,out,err} at
all, etc. For the trivial case, there is always vfork(), which does not
duplicate the address space, and blocks the parent until the child has
execve()d. This allows you to pull cute tricks like this:

    volatile int error = 0;

    if ((pid = vfork()) == 0) {
        error = execve(path, argv, envp);
        _exit(1);
    }
    if (pid == -1 || error != 0)
        perror("Failed to start subprocess");

DES
--
Dag-Erling Smørgrav - d...@des.no
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
:On Thursday 21 May 2009 23:37:20 Nate Eldredge wrote:
:> Of course all these problems are solved, under any policy, by having
:> more memory or swap. But overcommit allows you to do more with less.
:
:Or to put it another way, 90% of the problems that could be solved by
:having more memory can also be solved by pretending you have more memory
:and hoping no-one calls your bluff.
:
:Jonathan

It's a bit more complicated than that. Most of the memory duplication
(or lack of it) which occurs after a fork() is deterministic. It's not a
matter of pretending, it's a matter of practical application.

For example, when sendmail fork()'s, a deterministic subset of the
duplicated writable memory will never be modified by the child. Ever.
This is what overcommit takes advantage of. Nearly every program which
fork()'s has a significant level of duplication of writable memory which
deterministically reduces the set of pages which will ever need to be
demand-copied. The OS cannot predict which pages these will be, but the
effect from a whole-systems point of view is well known and
deterministic.

Similarly, the OS cannot really determine who is responsible for running
the system out of memory. Is it that big whopping program X, or is it
the 200 fork()'ed copies of server Y? Only a human being can really make
the determination. This is also why turning off overcommit can easily
lead to the system failing even if it is nowhere near running out of
actual memory.

In other words, the only real practical result of turning off overcommit
is to make a system less stable and less able to deal with exceptional
conditions. Systems which cannot afford to run out of memory are built
from the ground up to not allocate an unbounded amount of memory in the
first place. There's no other way to do it. The Mars Rover is a good
example of that. In such systems actually running out of memory is often
considered to be a fatal fault.
-Matt
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Fri, May 22, 2009 at 02:42:06AM +0100, Thomas Hurst wrote:
> * Nate Eldredge (neldre...@math.ucsd.edu) wrote:
> > There may be a way to enable the conservative behavior; I know Linux
> > has an option to do this, but am not sure about FreeBSD.
>
> I seem to remember a patch to disable overcommit. Here we go:
>
> http://people.freebsd.org/~kib/overcommit/

Latest version is at
http://people.freebsd.org/~kib/overcommit/vm_overcommit.22.patch
applicable to today's CURRENT.
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
* Yuri [090521 10:52] wrote:
> Nate Eldredge wrote:
> > Suppose we run this program on a machine with just over 1 GB of
> > memory. The fork() should give the child a private "copy" of the 1 GB
> > buffer, by setting it to copy-on-write. In principle, after the
> > fork(), the child might want to rewrite the buffer, which would
> > require an additional 1 GB to be available for the child's copy. So
> > under a conservative allocation policy, the kernel would have to
> > reserve that extra 1 GB at the time of the fork(). Since it can't do
> > that on our hypothetical 1+ GB machine, the fork() must fail, and the
> > program won't work.
>
> I don't have strong opinion for or against "memory overcommit". But I
> can imagine one could argue that fork with intent of exec is a faulty
> scenario that is a relict from the past. It can be replaced by some
> atomic method that would spawn the child without overcommitting.

vfork; however, that's not sufficient for many scenarios.

> Are there any other than fork (and mmap/sbrk) situations that would
> overcommit?

sysv shm? maybe more.

--
- Alfred Perlstein
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Thursday 21 May 2009 23:37:20 Nate Eldredge wrote:
> Of course all these problems are solved, under any policy, by having
> more memory or swap. But overcommit allows you to do more with less.

Or to put it another way, 90% of the problems that could be solved by
having more memory can also be solved by pretending you have more memory
and hoping no-one calls your bluff.

Jonathan
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Fri, 22 May 2009, Yuri wrote:
> Nate Eldredge wrote:
> > Suppose we run this program on a machine with just over 1 GB of
> > memory. The fork() should give the child a private "copy" of the 1
> > GB buffer, by setting it to copy-on-write. In principle, after the
> > fork(), the child might want to rewrite the buffer, which would
> > require an additional 1 GB to be available for the child's copy. So
> > under a conservative allocation policy, the kernel would have to
> > reserve that extra 1 GB at the time of the fork(). Since it can't
> > do that on our hypothetical 1+ GB machine, the fork() must fail,
> > and the program won't work.
>
> I don't have strong opinion for or against "memory overcommit". But I
> can imagine one could argue that fork with intent of exec is a faulty
> scenario that is a relict from the past. It can be replaced by some
> atomic method that would spawn the child without overcommitting.

If all you are going to do is call execve() then use vfork(). That
explicitly does not copy the parent's address space.

Also your example is odd: if you have a program using 1 GB (RAM + swap)
and you want to start another (in any way) then that is going to be
impossible. If you had a 750 MB process that forked and the child only
modified 250 MB you'd be all right, because the other pages would remain
shared copy-on-write.

--
Daniel O'Connor
software and network engineer for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there are so many of them to
choose from." -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
There is no such thing as a graceful way to deal with running out of
memory. What is a program supposed to do? Even if it gracefully exits,
it still fails to perform the function for which it was designed. If
such a program is used in a script then the script fails as well.

Even the best systems (e.g. space shuttle, Mars rover, airplane control
systems) which try to deal with unexpected situations still have to have
the final option, that being a complete reset. And even a complete reset
is no guarantee of recovery (as one of the original Airbus accidents
during an air show revealed, when the control systems got into a reset
loop and the pilot could not regain control of the plane). The most
robust systems do things like multiple independent built-to-spec
programs and a voting system, which require 10 times the man power to
code and test; something you will likely never see in the open-source
world or even in most of the commercial application world.

In fact, it is nearly impossible to write code which gracefully fails
even if the intent is to gracefully fail (and even if one can figure out
what a workable graceful failure path would even be). You would have to
build code paths to deal with the failure conditions, significantly
increasing the size of the code base, and you would have to test every
possible failure combination to exercise those code paths to make sure
they actually work as expected. If you don't, then the code paths
designed to deal with the failure will themselves likely be full of bugs
and make the problem worse. People who try to program this way but don't
have the massive resources required often wind up with seriously bloated
and buggy code.

So if the system runs out of memory (meaning physical memory + all
available swap), having a random subset of programs of any size start to
fail will rapidly result in a completely unusable system, and only a
reboot will get it back into shape. At least until it runs out of memory
again.

The best one can do is make the failures more deterministic. Killing the
largest program is one such mechanic. Knowing how the system will react
makes it easier to restore the system without necessarily rebooting it.
Of course there might have to be exceptions, as you don't want your X
server to be the program chosen. Generally, though, having some sort of
deterministic progression is going to be far better than having half a
dozen random programs which happen to be trying to allocate memory
suddenly get an unexpected memory allocation failure.

Also, it's really a non-problem. Simply configure a lot of swap... like
8G or 16G if you really care. Or 100G. Then you *DO* get a graceful
failure which gives you time to figure out what is going on and fix it.
The graceful failure is that the system starts to page to swap more and
more heavily, getting slower and slower in the process, but doesn't
actually have to kill anything for minutes to hours depending on the
failure condition. It's a lot easier to write code which reacts to a
system which is operating at less than ideal efficiency than it is to
write code which reacts to the failure of a core function (that of
allocating memory). One could even monitor swap use and ring the alarm
bells if it goes above a certain point.

Overcommit has never been the problem. The problem is that there is no
way a large system can gracefully deal with running out of memory,
overcommit or not.

-Matt
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
* Nate Eldredge (neldre...@math.ucsd.edu) wrote:
> There may be a way to enable the conservative behavior; I know Linux
> has an option to do this, but am not sure about FreeBSD.

I seem to remember a patch to disable overcommit. Here we go:

http://people.freebsd.org/~kib/overcommit/

--
Thomas 'Freaky' Hurst
http://hur.st/
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Thu, 21 May 2009, Yuri wrote:
> Nate Eldredge wrote:
> > Suppose we run this program on a machine with just over 1 GB of
> > memory. The fork() should give the child a private "copy" of the 1 GB
> > buffer, by setting it to copy-on-write. In principle, after the
> > fork(), the child might want to rewrite the buffer, which would
> > require an additional 1 GB to be available for the child's copy. So
> > under a conservative allocation policy, the kernel would have to
> > reserve that extra 1 GB at the time of the fork(). Since it can't do
> > that on our hypothetical 1+ GB machine, the fork() must fail, and the
> > program won't work.
>
> I don't have strong opinion for or against "memory overcommit". But I
> can imagine one could argue that fork with intent of exec is a faulty
> scenario that is a relict from the past. It can be replaced by some
> atomic method that would spawn the child without overcommitting.

I would say rather it's a centerpiece of Unix design, with an
unfortunate consequence. Actually, historically this would have been
much more of a problem than at present, since early Unix systems had
much less memory, no copy-on-write, and no virtual memory (this came in
with BSD, it appears; it's before my time).

The modern "atomic" method we have these days is posix_spawn, which has
a pretty complicated interface if you want to use pipes or anything. It
exists mostly for the benefit of systems whose hardware is too primitive
to be able to fork() in a reasonable manner. The old way to avoid the
problem of needing this extra memory temporarily was to use vfork(), but
this has always been a hack with a number of problems. IMHO neither of
these is preferable in principle to fork/exec.

Note another good example is a large process that forks, but the child,
rather than exec'ing, performs some simple task that writes to very
little of its "copied" address space. Apache does this, as Bernd
mentioned. This also is greatly helped by having overcommit, but can't
be circumvented by replacing fork() with something else. If it really
doesn't need to modify any of its shared address space, a thread can
sometimes be used instead of a forked subprocess, but this has issues of
its own.

Of course all these problems are solved, under any policy, by having
more memory or swap. But overcommit allows you to do more with less.

> Are there any other than fork (and mmap/sbrk) situations that would
> overcommit?

Perhaps, but I can't think of good examples offhand.

--
Nate Eldredge
neldre...@math.ucsd.edu
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Thu, May 21, 2009 at 10:52:26AM -0700, Yuri wrote:
> Nate Eldredge wrote:
> > Suppose we run this program on a machine with just over 1 GB of
> > memory. The fork() should give the child a private "copy" of the 1 GB
> > buffer, by setting it to copy-on-write. In principle, after the
> > fork(), the child might want to rewrite the buffer, which would
> > require an additional 1 GB to be available for the child's copy. So
> > under a conservative allocation policy, the kernel would have to
> > reserve that extra 1 GB at the time of the fork(). Since it can't do
> > that on our hypothetical 1+ GB machine, the fork() must fail, and the
> > program won't work.
>
> I don't have strong opinion for or against "memory overcommit". But I
> can imagine one could argue that fork with intent of exec is a faulty
> scenario that is a relict from the past. It can be replaced by some
> atomic method that would spawn the child without overcommitting.
>
> Are there any other than fork (and mmap/sbrk) situations that would
> overcommit?

If your system has enough virtual memory for working without
overcommitment, it will run fine with overcommitment as well. If you
don't have enough memory, it can do much more with overcommitment. A
simple apache process needing 1 GB and serving 1000 clients would need
1 TB of swap without ever touching it. Same for small embedded systems
with limited swap. So the requirement of overcommitment is not just a
requirement of the old days.

Overcommitment is even used more and more. An example is snapshots,
which are popular these days and can lead to space failure in case you
rewrite a file with new data without growing its length. The old sparse
file concept is also one of them, which can confuse unaware software.
And then we have geom_virstore since a while. Many modern databases do
it as well.

--
B.Walter  http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and more.
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
Nate Eldredge wrote:
> Suppose we run this program on a machine with just over 1 GB of
> memory. The fork() should give the child a private "copy" of the 1 GB
> buffer, by setting it to copy-on-write. In principle, after the
> fork(), the child might want to rewrite the buffer, which would
> require an additional 1 GB to be available for the child's copy. So
> under a conservative allocation policy, the kernel would have to
> reserve that extra 1 GB at the time of the fork(). Since it can't do
> that on our hypothetical 1+ GB machine, the fork() must fail, and the
> program won't work.

I don't have strong opinion for or against "memory overcommit". But I
can imagine one could argue that fork with intent of exec is a faulty
scenario that is a relict from the past. It can be replaced by some
atomic method that would spawn the child without overcommitting.

Are there any other than fork (and mmap/sbrk) situations that would
overcommit?

Yuri
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Thu, 21 May 2009, per...@pluto.rain.com wrote:
> Nate Eldredge wrote:
> > With overcommit, we pretend to give the child a writable private
> > copy of the buffer, in hopes that it won't actually use more of it
> > than we can fulfill with physical memory.
>
> I am about 99% sure that the issue involves virtual memory, not
> physical, at least in the fork/exec case. The incidence of such events
> under any particular system load scenario can be reduced or eliminated
> simply by adding swap space.

True. When I said "a system with 1 GB of memory", I should have said "a
system with 1 GB of physical memory + swap".

--
Nate Eldredge
neldre...@math.ucsd.edu
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
Nate Eldredge wrote:
> For instance, consider the following program.
> this happens most of the time with fork() ...

It may be worthwhile to point out that one extremely common case is the
shell itself. Even /bin/sh is large; csh (the default FreeBSD shell) is
quite a bit larger, and bash larger yet. The case of "big program forks,
and the child process execs a small program" arises almost every time a
shell command (other than a built-in) is executed.

> With overcommit, we pretend to give the child a writable private
> copy of the buffer, in hopes that it won't actually use more of it
> than we can fulfill with physical memory.

I am about 99% sure that the issue involves virtual memory, not
physical, at least in the fork/exec case. The incidence of such events
under any particular system load scenario can be reduced or eliminated
simply by adding swap space.
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
+--- Yuri, 2009-05-20 ---
| Seems like failing system calls (mmap and sbrk) that allocate memory
| is more graceful and would allow the program to at least issue a
| reasonable error message.
| And more intelligent programs would be able to reduce used memory
| instead of just dying.
|
| Yuri
+---

Hi!

You can set a memory limit to achieve your goal:

    tcsh% limit vmemoryuse 20M

In this case, malloc(10) will return 0.

Ilya.
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
Because the kernel is lazy!!

You can google for "lazy algorithm", or find an OS internals book and
read about the advantages of doing it this way...

Rayson

On Thu, May 21, 2009 at 1:32 AM, Yuri wrote:
> Seems like failing system calls (mmap and sbrk) that allocate memory
> is more graceful and would allow the program to at least issue a
> reasonable error message.
> And more intelligent programs would be able to reduce used memory
> instead of just dying.
>
> Yuri
Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
On Wed, 20 May 2009, Yuri wrote:
> Seems like failing system calls (mmap and sbrk) that allocate memory
> is more graceful and would allow the program to at least issue a
> reasonable error message.
> And more intelligent programs would be able to reduce used memory
> instead of just dying.

It's a feature, called "memory overcommit". It has a variety of pros and
cons, and is somewhat controversial.

One advantage is that programs often allocate memory (in various ways)
that they will never use, which under a conservative policy would result
in that memory being wasted, or programs failing unnecessarily. With
overcommit, you sometimes allocate more memory than you have, on the
assumption that some of it will not actually be needed.

Although memory allocated by mmap and sbrk usually does get used in
fairly short order, there are other ways of allocating memory that are
easy to overlook, and which may "allocate" memory that you don't
actually intend to use. Probably the best example is fork(). For
instance, consider the following program.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define SIZE (1UL << 30)  /* 1 GB */

    int main(void) {
        char *buf = malloc(SIZE);  /* 1 GB */
        memset(buf, 'x', SIZE);    /* touch the buffer */
        pid_t pid = fork();
        if (pid == 0) {
            execlp("true", "true", (char *)NULL);
            perror("true");
            _exit(1);
        } else if (pid > 0) {
            for (;;);  /* do work */
        } else {
            perror("fork");
            exit(1);
        }
        return 0;
    }

Suppose we run this program on a machine with just over 1 GB of memory.
The fork() should give the child a private "copy" of the 1 GB buffer, by
setting it to copy-on-write. In principle, after the fork(), the child
might want to rewrite the buffer, which would require an additional 1 GB
to be available for the child's copy. So under a conservative allocation
policy, the kernel would have to reserve that extra 1 GB at the time of
the fork(). Since it can't do that on our hypothetical 1+ GB machine,
the fork() must fail, and the program won't work.

However, in fact that memory is not going to be used, because the child
is going to exec() right away, which will free the child's "copy".
Indeed, this happens most of the time with fork() (but of course the
kernel can't know when it will or won't). With overcommit, we pretend to
give the child a writable private copy of the buffer, in hopes that it
won't actually use more of it than we can fulfill with physical memory.
If it doesn't use it, all is well; if it does use it, then disaster
occurs and we have to start killing things.

So the advantage is you can run programs like the one above on machines
that technically don't have enough memory to do so. The disadvantage, of
course, is that if someone calls the bluff, then we kill random
processes. However, this is not all that much worse than failing
allocations: although programs can in theory handle failed allocations
and respond accordingly, in practice they don't do so and just quit
anyway. So in real life, both cases result in disaster when memory "runs
out"; with overcommit, the disaster is a little less predictable but
happens much less often.

If you google for "memory overcommit" you will see lots of opinions and
debate about this feature on various operating systems. There may be a
way to enable the conservative behavior; I know Linux has an option to
do this, but am not sure about FreeBSD. This might be useful if you are
paranoid, or run programs that you know will gracefully handle running
out of memory. IMHO for general use it is better to have overcommit, but
I know there are those who disagree.

--
Nate Eldredge
neldre...@math.ucsd.edu
Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
Seems like failing system calls (mmap and sbrk) that allocate memory is
more graceful and would allow the program to at least issue a reasonable
error message.
And more intelligent programs would be able to reduce used memory
instead of just dying.

Yuri