Re: Replacement for grep(1) (part 2)
:
:It results sometimes in out of swap, too.
:
:> Inetd is rate-limited by default nowadays, so this really doesn't apply.
:
:It really does apply. Inetd limits incoming connections per minute, not per
:second. It is possible to use minute limit in a few seconds and cause a high
:load. Sendmail is worse than inetd; it cannot limit incoming rate on
:
:Netch

    You can specify a maximum fork limit for inetd on a per-service basis.
    You are a year or two too late on these things. A great many improvements
    have been made to programs like sendmail and inetd explicitly to deal
    with overload situations. Web servers too.

    These were fairly simple changes as well. For sendmail it was as simple
    as making MaxDaemonChildren apply to queue runs - I submitted that one
    to Eric Allman two years ago and it's been a part of sendmail since then.
    For inetd it is the -c, -C, and -R options (which can be specified on a
    per-service basis as well). Dima and I added the -R option back in 1997
    specifically to help with DoS attacks.

    Sendmail is not an issue when properly configured.

                                        -Matt
                                        Matthew Dillon

To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message
Re: Replacement for grep(1) (part 2)
:>     machine. In this case the overcommit that can occur is with I/O, not
:>     swap. As a general performance rule, you have to set MaxDaemonChildren
:>     and MaxArticleSize to prevent the overcommit from occurring. This is a
:>     function of sendmail, not a function of the kernel.
:
:Sigh. ((c)you) Sendmail can overcommit a machine with right set of
:MaxDaemonChildren, MaxArticleSize, QueueLA & RefuseLA options - I have seen
:such situations. MaxDaemonChildren limits only number of main processes for
:incoming connections (plus queue run processes). For each connection, after
:"mail from:" and until accepting message, server process for incoming
:connection forks child which accepts recipient list and letter body. After
:message accepting, that child can fork delivery process. A queue run process
:with "O ForkEachJob=true" option, which is default, can create a delivery
:process for each queue job (in my practice, queue of more than 1000 jobs is
:--
:Netch

    Actually this isn't true. QueueLA & RefuseLA tend to be useless options
    with sendmail. MaxDaemonChildren, on the other hand, tends to be a very
    useful option. By running the daemon and the queue separately, and
    putting the daemon in queue-only mode, sendmail has virtually no chance
    of taking down the machine.

    Example (assuming a box w/256MB of ram):

        sendmail -bd -O MaxDaemonChildren=130 -O DeliveryMode=queue
        sendmail -q1m -O MaxDaemonChildren=40 -O MinQueueAge=30m
        sendmail -q1m -O MaxDaemonChildren=40 -O MinQueueAge=2h

    This is what we do at BEST. Once we began doing things this way, our
    three (continuously loaded) frontend mail machines never bogged down
    again.

                                        -Matt
                                        Matthew Dillon
Re: Replacement for grep(1) (part 2)
Matthew Dillon wrote:

>     Give me a shell and I can crash any machine.

Oh. ;|

>     A good example of this is sendmail. Before the MaxDaemonChildren and
>     MaxArticleSize options, it was possible for sendmail to overcommit a
>     machine. In this case the overcommit that can occur is with I/O, not
>     swap. As a general performance rule, you have to set MaxDaemonChildren
>     and MaxArticleSize to prevent the overcommit from occurring. This is a
>     function of sendmail, not a function of the kernel.

Sigh. ((c)you)
Sendmail can overcommit a machine even with the right set of
MaxDaemonChildren, MaxArticleSize, QueueLA & RefuseLA options - I have seen
such situations. MaxDaemonChildren limits only the number of main processes
for incoming connections (plus queue run processes). For each connection,
after "mail from:" and until the message is accepted, the server process for
the incoming connection forks a child which accepts the recipient list and
the letter body. After accepting the message, that child can fork a delivery
process. A queue run process with the "O ForkEachJob=true" option, which is
the default, can create a delivery process for each queue job (in my
practice, a queue of more than 1000 jobs is an ordinary event). All these
forks depend on only one test - get the current LA and compare it with
QueueLA - which fails when the high load appeared less than one minute ago.

To prevent this overcommit (I go into the details in a parallel message),
the minimal (and possibly insufficient) set of settings is:
1) a patch - insert sm_sleep(1) into the server subprocess code before the
   "accepted" reply - to limit the incoming mail rate;
2) decrease QueueLA for the listening daemon to a sub-minimal value
   (e.g. 2);
3) increase QueueLA for the queue-running daemons to high values (e.g. 50)
   and set "O ForkEachJob=false" for them.

But most of these tunings are indirect. A direct tuning, found
experimentally on my mail servers, is a specially hacked pstat program that
returns 1 if either swap or file descriptors are more than 2/3 used, and 0
otherwise; on getting 1, sendmail stops delivering. Unfortunately, this
check is unportable.

(P.S. Don't tell me to change MTA; that is another question entirely.)

>     Another good example is a web server. A web server must have specific
>     limitations on the number of simultaneous connections it is allowed
>     to handle at once and on the number of CGI's or other auxiliary programs
>     that are allowed to be running at any given time. The overcommit issue
>     here has nothing to do with swap and everything to do with performance.
>     Specifically, these limitations exist to avoid cascade failures.

As in the sendmail case, you propose making some calculations (which are
difficult and non-trivial for newbies) to estimate the necessary resources.
Another way, which is imho more acceptable, is to provide not hard barriers
(SIGKILL on overcommitting) but soft barriers (e.g., stop allocating memory
to non-wheel users when memory begins to run out). An extra 64M of memory or
a disk for swap is commonly much cheaper than the profit lost when a
critical service crashes.

>     In the same manner any truly critical system server must handle the
>     resource management itself to deal with all sorts of problem situations,
>     including memory. You do not need to build any of this control into the
>     kernel.

No, we need it. Not every server can be patched for such tests (due to loss
of sources or another reason), and not every admin can make the necessary
patches. The kernel must help with it.

--
Netch
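The "hacked pstat" check described above comes down to a two-thirds threshold on two resources. A minimal sketch of just that decision follows, in Python; the function name, its parameters, and the handling of a zero total are assumptions made for illustration, and gathering the actual swap and descriptor counts (the unportable part) is deliberately left out.

```python
from fractions import Fraction

# Threshold taken from the message: more than two-thirds used means "stop".
THRESHOLD = Fraction(2, 3)


def overloaded(swap_used, swap_total, fds_used, fds_total, threshold=THRESHOLD):
    """Return 1 if swap or file descriptors are used beyond the threshold, else 0.

    The caller supplies the counts; how they are collected is system specific.
    """
    def over(used, total):
        # Assumption: a resource with no capacity configured is not "over".
        return total > 0 and Fraction(used, total) > threshold

    return 1 if over(swap_used, swap_total) or over(fds_used, fds_total) else 0
```

Exact fractions are used so that a resource sitting at precisely two-thirds is not reported as over the limit because of rounding.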
Re: Replacement for grep(1) (part 2)
Brian F. Feldman wrote:

>>     There are other ways. For example, even if a user account is resource
>>     limited, root processes (such as sendmail, popper, identd, and so forth)
>>     are not. Attacks against these servers generally result in very high
>>     loads and sometimes make it difficult to login to fix the problem, but do
>>     not result in running out of swap.

It results sometimes in out of swap, too.

> Inetd is rate-limited by default nowadays, so this really doesn't apply.

It really does apply. Inetd limits incoming connections per minute, not per
second. It is possible to use up the minute's limit in a few seconds and
cause a high load. Sendmail is worse than inetd; it cannot limit the
incoming rate on an established connection.

Butenko's (bute...@stalker.com) DoS attack on sendmail is to send thousands
of letters to a local user over a fast network connection (e.g., Ethernet)
through one established TCP connection. The only barrier is the LA test
before the '250 XXX message accepted to delivery' reply and the
fork-and-deliver-or-queue-and-exit decision, but the attacker can send too
many letters in a few seconds; hundreds of delivery processes then sit in
/usr/libexec/mail.local, waiting on the mailbox lock. LA reflects the system
state over the last minute, and thus is like the average temperature of a
hospital's patients over the last year. ;(

I have seen a variant of this attack on my mail hosts, when a host with 6000
letters in its mail queue (a mail2news server) sent all its mail to the
smarthost (a uucp spool server); after ~500 letters, sendmail on the
smarthost closed port 25 on RefuseLA; it was saved from running out of swap
only because domain resolving took some time.

The only mechanism against this type of attack I can imagine is to
sm_sleep(1) in the smtp server code at "mail from:" or before
'250 Message accepted for delivery'.

For inetd, we must limit connections per second, not per minute.

--
Netch
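The per-minute versus per-second point can be shown with a toy fixed-window limiter. This is a sketch in Python, not inetd's actual accounting; the burst size and both limits are illustrative numbers chosen so that the two limiters allow roughly the same average rate.

```python
def admitted(arrival_times, limit, window):
    """Return the arrival times accepted by a fixed-window rate limiter.

    arrival_times: ascending timestamps in seconds.
    limit:         maximum accepted connections per window.
    window:        window length in seconds.
    """
    counts = {}
    accepted = []
    for t in arrival_times:
        bucket = int(t // window)
        if counts.get(bucket, 0) < limit:
            counts[bucket] = counts.get(bucket, 0) + 1
            accepted.append(t)
    return accepted


def peak_per_second(times):
    """Largest number of accepted connections that fall within one second."""
    per_second = {}
    for t in times:
        per_second[int(t)] = per_second.get(int(t), 0) + 1
    return max(per_second.values(), default=0)


# 600 connection attempts arriving within three seconds (200 per second).
burst = [i / 200 for i in range(600)]

per_minute = admitted(burst, limit=256, window=60)
per_second = admitted(burst, limit=5, window=1)

print(len(per_minute), peak_per_second(per_minute))  # 256 200
print(len(per_second), peak_per_second(per_second))  # 15 5
```

With the per-minute limit the whole allowance is spent inside the first two seconds, so the host still sees 200 forks in one second; the per-second limit caps the instantaneous load at 5.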
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
On Fri, 16 Jul 1999, Daniel C. Sobral wrote:

> Technical follow-up:
>
> Contrary to what I previously said, a number of tests reveal that
> Solaris, indeed, does not overcommit. All non-read only segments,

Neither does HP/UX 10.x. (Haven't got an 11 box handy to check.) The memory
allocation process is something like this:

1) Reserve is allocated from a swap area. Preference is given to swap
   devices, even if a swap file system has a higher priority.
2) If there is no space on a swap device, swap is allocated from a swap
   filesystem, if one is configured. If there is nothing to be allocated in
   a swap filesystem, the kernel attempts to grow the swap file on a
   filesystem by swchunk (a tunable, default 2MB, I think). (Swap on
   filesystems starts at zero or swchunk, and is grown as needed up to the
   limit spec'd at swapon(1M) time.)
3) If this fails, either because there is no space on the file system, or
   the swapfile has reached its limit, memory (actual core) is allocated.
   The system tunable swapmem_on determines whether memory is used for swap
   reserve or not. Default is to use it.
4) If there isn't swap to reserve, the request fails, even if none of the
   reserved swap is used.

The swapinfo(1M) man page makes this quite clear:

    +Requests for more paging space will fail when they cannot be satisfied
    by reserving device, file system, or memory paging, even if some of the
    reserved paging space is not yet in use. Thus it is possible for
    requests for more paging space to be denied when some, or even all, of
    the paging areas show zero usage - space in those areas is completely
    reserved.

The upside of this is that if you do run out of swap, the kernel doesn't
kill random processes. The downside is, I have seen 4GB boxes, with plenty
of swap, run out with less than a gig of memory actually in use. Oh, and if
you swap to a filesystem, you can fill it up, without actually using any of
the space. I don't know which behavior is more bogus.

David Scheidt
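The four reservation steps above can be modelled in a few lines. This is a toy sketch in Python of the order described in the message, not HP-UX code: the class and field names are invented for illustration, all sizes are in MB, and each request is assumed to be satisfied from a single tier. Note that the model tracks only reservations, never actual use, which is exactly why a request can fail while every area shows zero usage.

```python
from dataclasses import dataclass


@dataclass
class Pager:
    device_free: int         # MB not yet reserved on swap devices
    fs_size: int             # current size of the filesystem swap file
    fs_reserved: int         # MB of that file already reserved
    fs_limit: int            # maximum file size given at swapon time
    fs_disk_free: int        # MB still free on the underlying filesystem
    mem_free: int            # MB of core that may back reservations
    swapmem_on: bool = True  # whether memory may be used as swap reserve
    swchunk: int = 2         # growth increment for the swap file

    def reserve(self, mb):
        """Reserve mb megabytes; return the tier used, or None on failure."""
        # 1) Swap devices are preferred.
        if self.device_free >= mb:
            self.device_free -= mb
            return "device"
        # 2) Filesystem swap, growing the file in swchunk steps if needed.
        shortfall = mb - (self.fs_size - self.fs_reserved)
        if shortfall > 0:
            grow = -(-shortfall // self.swchunk) * self.swchunk  # round up
            if self.fs_size + grow <= self.fs_limit and grow <= self.fs_disk_free:
                self.fs_size += grow
                self.fs_disk_free -= grow
        if self.fs_size - self.fs_reserved >= mb:
            self.fs_reserved += mb
            return "filesystem"
        # 3) Actual core, if swapmem_on allows it.
        if self.swapmem_on and self.mem_free >= mb:
            self.mem_free -= mb
            return "memory"
        # 4) Nothing left to reserve: the request fails.
        return None
```

Growing the file also consumes `fs_disk_free`, which mirrors the remark that swapping to a filesystem can fill it up without any of the space being used.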
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Can we kill this thread already? This resolves nothing. The only good to
come of this is all of the nice doc-proj input Matt is providing (and
providing well, I might add.) There is no point that hasn't been rehashed a
dozen times over, and you (the ones who want overcommitting turned off) are
not helping the S/N ratio.

 Brian Fundakowski Feldman      _ __ ___ ___ ___ ___
 gr...@freebsd.org                   _ __ ___ | _ ) __|   \
     FreeBSD: The Power to Serve!        _ __ | _ \._ \ |) |
       http://www.FreeBSD.org/              _ |___/___/___/
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:>     I'm sorry, but when you write code for a safety related system you
:>     do not dynamically allocate memory at all. It's all essentially static.
:>     There is no issue with the memory resource. Besides, none of the BSD's are
:>     certified for any of that stuff that I know of.
:
:Sometimes it's not feasible to statically allocate memory. You
:dynamically allocate all the memory you need at program initialization
:(and no, we don't want to manage a pool of memory ourselves - that's
:what the OS is for).
:...
:Note that languages such as Ada raise exceptions when memory allocation
:fails. The underlying run-time relies on malloc returning null in
:order to raise an exception. Normally, programs written in Ada

    Simply set a resource limit. You are making the classic mistake of
    assuming that a fail-safe in the O.S. must be integrated all the way
    down into the user level when, in fact, it is simply a matter of
    setting a resource limit.

    When you are running an embedded system and have full control over the
    software being run, setting resource limits will do what you want. By
    doing so you are effectively managing the software modules on a
    module-by-module basis and not allowing one module to indirectly affect
    another. This is what you want to do in an embedded system: You do not
    want to create a situation where a failure in one module cascades into
    others.

                                        -Matt
                                        Matthew Dillon

:take great care to gracefully handle these exceptions. All the C
:programs that we've ever written also take great care in handling
:NULL returns from malloc.
:
:I have no problem with overcommit, but I can see the need that
:some folks have for turning it off. If you don't want to write
:the code to allow this, that's fine - you don't want/need it,
:so why should you? But if other folks see a need for it, let
:_them_ write the hooks for it :-)
:
:Dan Eischen
:eisc...@vigrid.com
:
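The "simply set a resource limit" advice can be tried out with a small experiment. The following is a sketch in Python, assuming a Unix-like system that enforces RLIMIT_AS (Linux does; which limit the 1999 discussion had in mind is not stated in the thread). The helper name and the sizes used are hypothetical. A child process caps its own address space and then attempts one allocation, so a refused allocation is confined to that one module.

```python
import subprocess
import sys

# Script run in the child: cap the address space, then try one allocation.
CHILD = """
import resource, sys
limit, size = int(sys.argv[1]), int(sys.argv[2])
resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
try:
    buf = bytearray(size)
except MemoryError:
    sys.exit(3)   # the allocation was refused; only this process is affected
sys.exit(0)
"""


def allocate_under_limit(limit_bytes, alloc_bytes):
    """Return True if a child capped at limit_bytes can allocate alloc_bytes."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(alloc_bytes)]
    )
    if result.returncode not in (0, 3):
        raise RuntimeError("child failed for an unrelated reason")
    return result.returncode == 0
```

For instance, with a 1 GiB cap a 1 MiB request is expected to succeed and a 2 GiB request to be refused immediately, instead of being overcommitted and failing later when the pages are touched.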
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
>     I'm sorry, but when you write code for a safety related system you
>     do not dynamically allocate memory at all. It's all essentially static.
>     There is no issue with the memory resource. Besides, none of the BSD's
>     are certified for any of that stuff that I know of.

Sometimes it's not feasible to statically allocate memory. You
dynamically allocate all the memory you need at program initialization
(and no, we don't want to manage a pool of memory ourselves - that's
what the OS is for).

Note that languages such as Ada raise exceptions when memory allocation
fails. The underlying run-time relies on malloc returning null in
order to raise an exception. Normally, programs written in Ada
take great care to gracefully handle these exceptions. All the C
programs that we've ever written also take great care in handling
NULL returns from malloc.

I have no problem with overcommit, but I can see the need that
some folks have for turning it off. If you don't want to write
the code to allow this, that's fine - you don't want/need it,
so why should you? But if other folks see a need for it, let
_them_ write the hooks for it :-)

Dan Eischen
eisc...@vigrid.com
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:    Well, NetBSD is slated to be used in the 'Space Acceleration
:    Measurement System II', measuring the microgravity environment on
:    the International Space Station using a distributed system based
:    on several NetBSD/i386 boxes.
:
:    Sometimes your 'what-if' scenarios are others' standard operating
:    procedures.
:
:    David/absolute
:
:          What _is_, what _should be_, and what _could be_ are all distinct.

    Ummm... this doesn't sound like a critical system to me. It sounds like
    an experiment.

    None of the BSD's (nor NT, nor any other complex general purpose
    operating system) are certified for critical systems in space. The
    reason is simple: None of these operating systems can deal with memory
    faults caused by radiation. You might see it for internal communications
    or non-critical sensing, but you aren't going to see it for external
    communications or thruster control.

                                        -Matt
                                        Matthew Dillon
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
On Fri, 16 Jul 1999, Matthew Dillon wrote:

>     I'm sorry, but when you write code for a safety related system you
>     do not dynamically allocate memory at all. It's all essentially static.
>     There is no issue with the memory resource. Besides, none of the BSD's
>     are certified for any of that stuff that I know of.
>
>     What's next: A space shot? These what-if scenarios are getting
>     ridiculous.

Well, NetBSD is slated to be used in the 'Space Acceleration
Measurement System II', measuring the microgravity environment on
the International Space Station using a distributed system based
on several NetBSD/i386 boxes.

Sometimes your 'what-if' scenarios are others' standard operating
procedures.

David/absolute

      What _is_, what _should be_, and what _could be_ are all distinct.
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
On Fri, 16 Jul 1999, Matthew Dillon wrote:

>: Well, NetBSD is slated to be used in the 'Space Acceleration
>: Measurement System II', measuring the microgravity environment on
>: the International Space Station using a distributed system based
>: on several NetBSD/i386 boxes.
>:
>: Sometimes your 'what-if' scenarios are others' standard operating
>: procedures.
>:
>: David/absolute
>:
>: What _is_, what _should be_, and what _could be_ are all distinct.
>
> Ummm... this doesn't sound like a critical system to me. It sounds like
> an experiment.

It's probably an awfully expensive experiment (putting things into space
is not cheap).

From a financial viewpoint that may be considered critical.

Cheers,

Al

--
Alan Horn - Sysadmin - Dreamworks (+1 818 695 6256) - [EMAIL PROTECTED]

I am Connor MacLeod of the Clan MacLeod. I was born in 1518 in the
village of Glenfinnan on the shores of Loch Sheil, and I am immortal.
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:For those who wish to develop code for safety related systems that is
:not good enough. They have to prove that all code can handle the
:degradation of resources gracefully. Such code relies on guaranteed
:memory allocations or in the very least warnings of memory shortage
:and prioritized allocations. So the least important sub-systems die
:first.
:
:--Sean

I'm sorry, but when you write code for a safety related system you
do not dynamically allocate memory at all. It's all essentially static.
There is no issue with the memory resource. Besides, none of the BSD's
are certified for any of that stuff that I know of.

What's next: A space shot? These what-if scenarios are getting
ridiculous.

-Matt
Matthew Dillon
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
: Well, NetBSD is slated to be used in the 'Space Acceleration
: Measurement System II', measuring the microgravity environment on
: the International Space Station using a distributed system based
: on several NetBSD/i386 boxes.
:
: Sometimes your 'what-if' scenarios are others' standard operating
: procedures.
:
: David/absolute
:
: What _is_, what _should be_, and what _could be_ are all distinct.

Ummm... this doesn't sound like a critical system to me. It sounds like
an experiment.

None of the BSD's (nor NT, nor any other complex general purpose
operating system) are certified for critical systems in space. The
reason is simple: None of these operating systems can deal with memory
faults caused by radiation. You might see it for internal communications
or non-critical sensing, but you aren't going to see it for external
communications or thruster control.

-Matt
Matthew Dillon <[EMAIL PROTECTED]>
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
"Daniel C. Sobral" wrote: > > It would be nice to have a way to indicate that, a la SIGDANGER. > > Ok, everybody is avoiding this, so I'll comment. Yes, this would be > interesting, and a good implementation will very probably be > committed. *BUT*, this is not as useful as it seems. Since the > correct solution is buy more memory/increase swap (correct solution > for our target markets, anyway), there is little incentive to > implement it. > > So, I think people who can answer the above is thinking like "Well, > it is useful, but it's not useful enough for me to spend my time on > it, and I'm sure as hell don't want to write mini-papers on why it's > not that useful". > For those who wish to develop code for safety related systems that is not good enough. They have to prove that all code can handle the degradation of resources gracefully. Such code relies on guaranteed memory allocations or in the very least warnings of memory shortage and prioritized allocations. So the least important sub-systems die first. --Sean To Unsubscribe: send mail to majord...@freebsd.org with "unsubscribe freebsd-hackers" in the body of the message
Re: Replacement for grep(1) (part 2)
Daniel C. Sobral wrote:

> > 4.4BSD derived system cannot do this, and have to use different
> > machine for such applications.
>
> Incorrect. We can set *limits* to the users, so they won't be able
> to crash down the system.

No. Really, not all users use the system at the same time, and it is
too cruel to set the limits too small. An average system has per-user
limits well above (total_resource*2/3)/n_users (2/3 is a sub-optimal
modifier). But if too many users begin to use the system, they can
overflow the resource. Group limits can soften the problem, but only a
little.

I don't remember the English word for a soft barrier now; the Russian
word is 'dempfer' ;) The system must provide such a soft barrier to
prevent overflow well before the real overflow. Imho, 20% of a typical
critical resource must be held back.

--
Netch
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Patrick Welche wrote:

> students != hostile users

We obviously have known different students... :-)

> Making mistakes is part of learning.

A hostile user is one which will act in a non-friendly manner.
Whether intentionally or not is irrelevant from the point of view of
the administrator, as far as protecting the system goes.

--
Daniel C. Sobral (8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Matthew Dillon wrote:

> :On Tue, 13 Jul 1999 23:18:58 -0400 (EDT)
> : John Baldwin wrote:
> :
> : > What does that have to do with overcommit? I student administrate an
> : > undergrad CS lab at a university, and when student's programs
> : > misbehaved, they generate a fault and are killed. The only machines
> : > that reboot on us without being explicitly told to are the NT ones,
> : > and yes we run FreeBSD.
> :
> :What does it have to do with overcommit? Everything in the world!
> :
> :If you have a lot of users, all of which have buggy programs which eat
> :a lot of memory, per-user swap quotas don't necessarily save your butt.
>
> If every single one of your users is trying to crash your machine daily,
> maybe you should consider throwing them off the system and finding users
> that are less hostile.
>
> This conversation is getting silly. Do you actually believe that
> an operating system can magically protect itself 100% from armloads of
> hostile users?
>
> Give me a break. You people are crazy. If you have something worthwhile
> to say i'll listen, but these "the sky is falling!" arguments are idiotic.
>
> -Matt

students != hostile users

Making mistakes is part of learning.

Patrick
Re: Replacement for grep(1) (part 2)
c...@netbsd.org (Chris G. Demetriou) writes:

> Matthew Dillon writes:
> > The text size of a program is irrelevant, because swap is never
> > allocated for it. The data and BSS are only relevant when they

No, you can mprotect read-only vnode mappings to writable. Most things
wouldn't be hurt badly if this changed, though, I suspect that this
already varies between operating systems.

> > are modified.
> >
> > The only thing swap is ever used for is the dynamic allocation of
> > memory.
> > There are three ways to do it: sbrk(), mmap(... MAP_ANON), or
> > mmap(... MAP_PRIVATE).
> yup, almost: not all MAP_PRIVATE mappings need backing store, only
> MAP_PRIVATE and writeable mappings. (MAP_PRIVATE does _not_ guarantee
> that you won't see modifications made via other MAP_SHARED mappings.)

...but in *this* case, you certainly shouldn't allow mprotect to fail
(with what, ENOMEM?). It's certainly counterintuitive to me that
mprotect could fail due to a resource shortage.

> Actually, only now have you brought that up. And, that's very system
> dependent. On NetBSD/i386 the default is 2MB, and, it's worth noting
> that you only need to reserve as much as the current stack limit
> allows (after that, you're going to get a signal anyway, and if more

So what setrlimit accepts depends on how much memory is available?
Ok, programs changing their stack limit are rare, but this would
still be another API change.
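The mprotect case raised above can be sketched in C, assuming a POSIX system. A file mapped MAP_PRIVATE and read-only needs no swap while it stays read-only, but once mprotect makes it writable, the first store produces a private copy of the page that does need backing. The function name `upgrade_private_mapping` is made up for this illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first `len` bytes of a non-empty file read-only and private,
 * then upgrade the mapping to writable and dirty its first byte.
 * Returns 0 on success, -1 if open, mmap or mprotect fails. */
int upgrade_private_mapping(const char *path, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Read-only private mapping: pages can be dropped and re-read from
     * the file, so no swap is required at this point. */
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    /* This is the step a no-overcommit kernel would have to account
     * for, and where a reservation failure could surface. */
    if (mprotect(p, len, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, len);
        return -1;
    }

    /* Copy-on-write: the store lands in a private page, the file on
     * disk is not modified. */
    ((volatile char *)p)[0] = 'x';

    munmap(p, len);
    return 0;
}
```

The file only has to be opened O_RDONLY, because stores to a MAP_PRIVATE mapping never reach the file itself.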
Re: Replacement for grep(1) (part 2)
jul...@whistle.com (Julian Elischer) writes:

> If you wanted to fix this, you could add a patch to malloc that touched
> every page that it handed to the application. (and trapped sig11s)

How would you expect that to work?

Several misunderstandings seem to be common regarding this issue (most
not directed at you):

- malloc almost never fails with NULL.

  This is not true, if resource limits are set properly, any one program
  using huge amounts of memory is going to hit them long before swap
  space is exhausted.

- The program currently trying to get the page is the one that is killed.

- Actually paging in all memory is going to protect a program from
  getting killed.

  This is going to make it *more likely* for it to be killed.

- Not overcommitting doesn't consume huge amounts of reserve space
  unless programs do something special.

  A rough sum of memory usage can be computed by summing up all of the
  process VSZs plus your stack limit times the number of processes.
  How many of you would be willing to configure that much swap space?

If you really wanted to run without overcommit, you'd only run
statically linked binaries and set your stack limits to small values.
This could be desirable for some (but not general-purpose) systems, an
option for doing this wouldn't be entirely bogus.
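The rough sum mentioned in the last point can be written down as a small C helper. This is only a sketch of the arithmetic as stated (sum of all process VSZs plus the stack limit times the number of processes); the function name `worst_case_reserve` and the idea of passing the VSZ values in as an array are assumptions for illustration, not an existing interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Estimate the backing store a no-overcommit system would have to
 * reserve: every process's virtual size, plus a full stack limit for
 * each process. All quantities use the same unit (e.g. megabytes). */
uint64_t worst_case_reserve(const uint64_t *vsz, size_t nprocs,
                            uint64_t stack_limit)
{
    uint64_t total = 0;

    for (size_t i = 0; i < nprocs; i++)
        total += vsz[i];

    return total + stack_limit * (uint64_t)nprocs;
}
```

For example, three processes with VSZs of 10, 20 and 30 MB and an 8 MB stack limit give 60 + 3 * 8 = 84 MB of reservation, regardless of how much of that memory is ever touched.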
Re: Replacement for grep(1) (part 2)
Daniel C. Sobral wrote:

> Eh? Reasonable programs *never* run into trouble. Trouble only
> happens when you have unreasonable programs around, or did not
> configure the system correctly. And if you did not configure the
> system correctly, why do you think you would be able to correctly
> estimate the stack needed for the various programs?

Your words are bad words. Exhausting of any of main resources -
virtual memory, disk space, process descriptors, file descriptors - is
a terrible situation, but one must not fight against headache with
headcutting. Every system can fall in uncontrolled state and eat all of
some resource, and kernel stack is to prevent process pool part from
this, not to destruct it.

I had seen two boxes where swap was out misfortunately with bad
results: on first (FreeBSD 2.2.7), system kills the cron (sic!) process,
on second (Linux) syslogd, sendmail and some others became poisoned
without any warnings. It is totally bad behavior; kernel must be friend,
not enemy.

Actions supposed enough by me for first (!) time:

1) Count in some kernel variables (readable by sysctl) overflows of
   virtual memory, file descriptors, process descriptors and other
   critical resources. This data must be available for watchdogs; for
   some systems, it is right to reboot them immediately after some
   overflow, not to try to work in poisoned state.

2) Run (in standard setup!) cron, syslogd and other important daemons
   from special init slot (as Linux and possibly other systems allow),
   not from startup scripts. Reason: they must be restarted when die
   without admin intervention and without wrappers which can also be
   killed on memory low.

3) Declare thresholds for critical resources; for example, when more
   than 80% of virtual memory is used, prevent everybody except euid==0
   or egid==0 from allocating new memory.

4) Provide special signal (SIGXMEM?) to send messages that there is
   memory low and all have to shorten their memory. Daemons should
   interpret this signal similarly to SIGHUP, with exec() itself and
   restart.

> Now comes the people saying "don't overcommit in *this* case, and
> overcommit in *that* case". Irrelevant. Programs are still getting
> killed because memory was overcommitted (with the added disadvantage
> of you not having as much memory as in a full overcommit mode).

Kernel can kill processes that try to get unexistent memory. But when
it did not prevent system from falling into overflow, it plays unfair
game.

--
Netch
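Point 3 above (a soft threshold beyond which only privileged processes may allocate) could be expressed as a policy check along the following lines. This is a user-space sketch of the decision only, not kernel code from any BSD. The name `allocation_allowed`, the page-count parameters, and the reading that a request is refused when it would push usage past 80% are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Decide whether a request for `request` pages may proceed when `used`
 * of `total` pages of virtual memory are already committed.
 * Unprivileged callers are held below 80% of the total; callers with
 * euid 0 or egid 0 may use the remaining 20% as well. */
bool allocation_allowed(uint64_t used, uint64_t total, uint64_t request,
                        uid_t euid, gid_t egid)
{
    /* Nobody can be given memory that does not exist. */
    if (used > total || request > total - used)
        return false;

    /* Privileged processes may dip into the reserve. */
    if (euid == 0 || egid == 0)
        return true;

    /* Soft barrier at 80% of the total. */
    uint64_t soft_limit = total - total / 5;
    return used + request <= soft_limit;
}
```

With 1000 pages in total and 700 in use, an unprivileged request for 100 pages is accepted and one for 101 pages is refused, while a root-owned process can still obtain up to 300 pages.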
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Matthew Dillon wrote:

> Something is weird here. If the solaris people are using a
> SWAPSIZE + REALMEM VM model, they have to allow the
> allocated + reserved space go +REALMEM bytes over available swap
> space. If not they are using only a SWAPSIZE VM model.

I did not check if the model was a SWAPSIZE+REALMEM or a SWAPSIZE
model. Anyway, I think you are assuming that the "swap -s" command
shows as total memory just the swap space... Maybe, maybe not. I
don't know. But the space against which I reached the ceiling *was*
the one reported in the "swap -s" command.

> Wait - does Solaris normally use swap files or swap partitions?
> Or is it that weird /tmp filesystem stuff? If it normally uses swap
> files and allows holes then that explains everything.

I'd say partitions. While perusing man pages, I caught briefly the
comment that a swap partition could overwrite a normal partition, in
a man page about a special command to create swap partitions.

Anything you'd like me to check in particular? If you have any
source code you'd like me to run, just send it to
c...@comp.cs.gunma-u.ac.jp, though I can only run them at the
earliest on monday. Well, at least my monday is your sunday night...
:-)

--
Daniel C. Sobral (8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
[cc: list trimmed]

On Thu, 15 Jul 1999 lyn...@orthanc.ab.ca wrote:

> > In that scenario, the 512MB of swap I assigned to this machine would be
> > dangerously low.
>
> With 13GB disks available for a couple of hundred bucks, my machines aren't
> going to run out of swap space any time soon, even if I commit to disk.
>
> All I want for Christmas is a knob to disable overcommit.
>
> --lyndon

CVSup the source repository and start writing.

Sander

There is no love, no good, no happiness and no future -
all these are just illusions.
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
On Thu, Jul 15, 1999 at 09:57:31PM -0700, Matthew Dillon wrote:

> Something is weird here. If the solaris people are using a
> SWAPSIZE + REALMEM VM model, they have to allow the
> allocated + reserved space go +REALMEM bytes over available swap
> space. If not they are using only a SWAPSIZE VM model.
>
> Wait - does Solaris normally use swap files or swap partitions?
> Or is it that weird /tmp filesystem stuff? If it normally uses swap
> files and allows holes then that explains everything.

No, swap is slice based in Solaris. tmpfs is just a filesystem (much
like MFS) which uses swap as backing store. I will admit to never
quite understanding the relationship of how much swap tmpfs is willing
to steal though... Maybe I should go and read the answerbook
(http://docs.sun.com if you want a peek).

--
Dom Mitchell -- Palmer & Harvey McLane -- Unix Systems Administrator

In Mountain View did Larry Wall
Sedately launch a quiet plea:
That DOS, the ancient system, shall
On boxes pleasureless to all
Run Perl though lack they C.
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:Technical follow-up:
:
:Contrary to what I previously said, a number of tests reveal that
:Solaris, indeed, does not overcommit. All non-read only segments,
:and all malloc()ed memory is reserved upon exec() or fork(), and the
:reserved memory is not allowed to exceed the total memory. It makes
:extensive use of read only DATA segments, and has a NON_RESERVE
:mmap() flag.
:
:Though the foot firmly planted in my mouth ought to prevent me from
:saying anything else, I must say that it does explain a few things
:to me...
:
:--
:Daniel C. Sobral (8-DCS)
:d...@newsguy.com

    Something is weird here. If the solaris people are using a
    SWAPSIZE + REALMEM VM model, they have to allow the
    allocated + reserved space go +REALMEM bytes over available swap
    space. If not they are using only a SWAPSIZE VM model.

    Wait - does Solaris normally use swap files or swap partitions?
    Or is it that weird /tmp filesystem stuff? If it normally uses swap
    files and allows holes then that explains everything.

    -Matt
    Matthew Dillon
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Technical follow-up:

Contrary to what I previously said, a number of tests reveal that
Solaris, indeed, does not overcommit. All non-read only segments,
and all malloc()ed memory is reserved upon exec() or fork(), and the
reserved memory is not allowed to exceed the total memory. It makes
extensive use of read only DATA segments, and has a NON_RESERVE
mmap() flag.

Though the foot firmly planted in my mouth ought to prevent me from
saying anything else, I must say that it does explain a few things
to me...

--
Daniel C. Sobral (8-DCS)
d...@newsguy.com
d...@freebsd.org

	"Would you like to go out with me?"
	"I'd love to."
	"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:> In that scenario, the 512MB of swap I assigned to this machine would be
:> dangerously low.
:
:With 13GB disks available for a couple of hundred bucks, my machines aren't
:going to run out of swap space any time soon, even if I commit to disk.
:
:All I want for Christmas is a knob to disable overcommit.
:
:--lyndon

    If your machines aren't going to run out of swap, then the overcommit
    isn't going to hurt you in a million years.

    -Matt
    Matthew Dillon
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
> And what I'm pretty sure the majority of the readers on this list want
> is for those of you who really think it's necessary to do it yourselves.
>
> What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
> wonder whether that's significant...

Sheldon, if you can't contribute something useful, then shut up.

If I have to do it myself, I will.
re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
   > All I want for Christmas is a knob to disable overcommit.

   And what I'm pretty sure the majority of the readers on this list want
   is for those of you who really think it's necessary to do it yourselves.

   What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
   wonder whether that's significant...

that's an impressively bold statement to make. by my reckoning, at least
4 people who have posted "wanting no overcommit" are more than capable of
programming this for NetBSD.

.mrg.
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
On Thu, 15 Jul 1999 17:53:52 CST, lyn...@orthanc.ab.ca wrote:

> All I want for Christmas is a knob to disable overcommit.

And what I'm pretty sure the majority of the readers on this list want
is for those of you who really think it's necessary to do it yourselves.

What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
wonder whether that's significant...

Ciao,
Sheldon.
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
> In that scenario, the 512MB of swap I assigned to this machine would be
> dangerously low.

With 13GB disks available for a couple of hundred bucks, my machines aren't
going to run out of swap space any time soon, even if I commit to disk.

All I want for Christmas is a knob to disable overcommit.

--lyndon
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
    Here is what I get from one of BEST's mail & www proxy machines.
    ~dillon/br adds the object sizes together. 'swap' and 'default'
    objects refer to unbacked VM objects - and none of the processes
    running fork shared unbacked objects so we don't have to worry
    about that.

    The 'swap' designation means that at least one page in the object has
    been assigned swap. The default designation means that no pages have
    been assigned swap. The pages can be dirty or clean.

    Typical /proc/PID/map output looks like this (taken from one of the
    sendmail processes). The lines I've marked are the ones being counted
    as unbacked/swap-backed VM. The rest are vnode-backed and not counted.

    0x1000      0x4b000     66   0    r-x COW vnode
    0x4b000     0x4e000     3    3    rwx COW vnode
    0x4e000     0x87000     53   43   rwx COW swap     <---
    0x87000     0x373000    738  738  rwx     default  <---
    0x2004b000  0x2005a000  2    0    r-x COW vnode
    0x2005a000  0x2005c000  2    0    rwx COW vnode
    0x2005c000  0x20065000  6    2    rwx COW swap     <---
    0x20068000  0x2006d000  3    0    r-x COW vnode
    0x2006d000  0x2006e000  1    1    rwx COW vnode
    0x2006e000  0x200cc000  70   0    r-x COW vnode
    0x200cc000  0x200d0000  4    4    rwx COW vnode
    0x200d0000  0x200e7000  8    6    rwx COW swap     <---
    0xefbde000  0xefbfe000  14   14   rwx COW swap     <---

    proxy1:/tmp# cat /proc/*/map | egrep 'swap|default' | ~dillon/br
    639168K

    proxy1:/tmp# pstat -s
    Device      1K-blocks     Used    Avail Capacity  Type
    /dev/sd0b      524288    12596   511628     2%    Interleaved

    This machine has 256MB of ram of which around 200MB is in use; we will
    assume the entire 200MB is used by VM spaces for processes. It is an
    active machine with around 205 processes at the time of the test.

    So. 200MB of ram + 12MB of swap = 212MB of actual storage being used
    out of 639MB of total swap-backable VM. About a factor of 3.2:1.
    Actual swap utilization is sitting at 2%.

    If no overcommit were allowed, and assuming a VMSPACE = REALMEM + SWAP
    model, 200MB of ram would be active and 439MB worth of swap would be
    either allocated or reserved ( though only 12MB would be actually
    written, that part doesn't change ).

    439MB of swap versus 12MB of swap. In that scenario, the 512MB of swap
    I assigned to this machine would be dangerously low.

    -Matt
    Matthew Dillon
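The measurement above can be sketched in C. This is only a rough stand-in, not the actual ~dillon/br script (which is not shown in the thread): the function names sum_unbacked_kb and reserved_swap_mb are hypothetical, and the parser assumes each map line starts with the start and end address in hex, as in the sample output. It counts lines containing "swap" or "default", mirroring the egrep, and then applies the VMSPACE = REALMEM + SWAP arithmetic.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for ~dillon/br: sum (end - start), in KB, over
 * /proc/PID/map lines whose object type is "swap" or "default".
 * Assumes the first two fields are the start and end addresses in hex.
 */
unsigned long sum_unbacked_kb(FILE *in)
{
    char line[512];
    unsigned long total_kb = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        unsigned long start, end;

        /* Same filter as: egrep 'swap|default' */
        if (strstr(line, "swap") == NULL && strstr(line, "default") == NULL)
            continue;
        if (sscanf(line, "%lx %lx", &start, &end) == 2 && end > start)
            total_kb += (end - start) / 1024;
    }
    return total_kb;
}

/*
 * Swap that a no-overcommit VMSPACE = REALMEM + SWAP model would have to
 * allocate or reserve: all unbacked VM that does not fit in the RAM in use.
 */
unsigned long reserved_swap_mb(unsigned long unbacked_vm_mb,
                               unsigned long ram_in_use_mb)
{
    return unbacked_vm_mb > ram_in_use_mb ? unbacked_vm_mb - ram_in_use_mb : 0;
}
```

With the figures quoted above, reserved_swap_mb(639, 200) gives the 439MB in the message, against the 12MB of swap actually written.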
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
> If this is correct, then solaris is using a VMSPACE = SWAPSPACE
> model. FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

AFAIK it has been stated quite explicitly by the Solaris folks that
Solaris 2.x uses VMSPACE = SWAPSPACE + REALMEM. This is *different*
from SunOS 4.1.x.

Steinar Haug, Nethelp consulting, sth...@nethelp.no
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
In article you write:
>::-s    Print summary information about total swap
>::      space usage and availability:
>::
>::      allocated   The total amount of swap space
>::                  (in 1024-byte blocks)
>::                  currently allocated for use as
>::                  backing store.
>::
>::      reserved    The total amount of swap space
>::                  (in 1024-byte blocks) not
>::                  currently allocated, but
>::                  claimed by memory mappings for
>::                  possible future use.
>::
>::      used        The total amount of swap space
>::                  (in 1024-byte blocks) that is
>::                  either allocated or reserved.
>:--
>:soda
>
>It would be really easy to test this.
>
>Write a program that malloc's 32MB of space and touches it,
>then sleeps 10 seconds and forks, with both child and parent
>sleeping afterwards. ( the parent and the forked child should
>not touch the memory after the fork occurs ).
>
>Do a pstat -s before, after the initial touch, and after
>the fork. If you do not see the reserved swap space jump
>by 32MB after the fork, it isn't what you thought it was.

aladdin[5:32pm]> prtconf
System Configuration: Sun Microsystems i86pc
Memory size: 128 Megabytes
aladdin[5:41pm]> uname -a
SunOS aladdin 5.6 Generic_105182-14 i86pc i386

total: 67280k bytes allocated + 28668k reserved = 95948k used, 196460k avail
malloced 32MB...
total: 67320k bytes allocated + 61460k reserved = 128780k used, 163592k avail
touched...
total: 100084k bytes allocated + 28696k reserved = 128780k used, 163732k avail
forking...
total: 100092k bytes allocated + 61520k reserved = 161612k used, 130864k avail
touching again (parent)...
touching again (child)...
total: 132864k bytes allocated + 28748k reserved = 161612k used, 130760k avail
exiting...
exiting...
total: 67248k bytes allocated + 28700k reserved = 95948k used, 196448k avail

--
Jonathan
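To read the jumps out of a run like the one above, the summary lines can be parsed and compared. The following is a minimal sketch, assuming the "total: ...k bytes allocated + ...k reserved = ...k used, ...k avail" layout shown in the output; struct swap_summary and parse_swap_summary are hypothetical names, not part of any Solaris interface.

```c
#include <stdio.h>

/* Fields of one "swap -s" summary line, all in 1024-byte blocks. */
struct swap_summary {
    unsigned long allocated_kb;
    unsigned long reserved_kb;
    unsigned long used_kb;
    unsigned long avail_kb;
};

/*
 * Parse one summary line in the layout shown in the run above.
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_swap_summary(const char *line, struct swap_summary *out)
{
    int n = sscanf(line,
                   "total: %luk bytes allocated + %luk reserved = %luk used, %luk avail",
                   &out->allocated_kb, &out->reserved_kb,
                   &out->used_kb, &out->avail_kb);
    return n == 4 ? 0 : -1;
}
```

Comparing the first two lines of the run, reserved goes from 28668k to 61460k, a jump of 32792k at malloc time; after the touch the same amount moves from reserved to allocated, and the fork adds a comparable reservation again.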
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:Before program start:
:total: 20000k bytes allocated + 4792k reserved = 24792k used, 191048k available
:
:After malloc, before touch:
:total: 18756k bytes allocated + 37500k reserved = 56256k used, 159580k available
:
:After malloc + touch:
:total: 52804k bytes allocated + 4852k reserved = 57656k used, 158184k available
:
:After fork:
:total: 52928k bytes allocated + 37644k reserved = 90572k used, 125264k available
:
:[there has been a little background activity, but the numbers speak for themselves]
:
:
:Daniel

    Assuming the allocated field is not inclusive of real memory, what we
    have is swap reservation under solaris for clean pages, and allocation
    and assignment for dirty pages. The grand total will tell you the total
    VM potential for malloc'd space but does not appear to tell you how much
    swap is actually active - i.e. was written to and contains valid data.

    It would be interesting to see if the stack segment is included in the
    reservation. Try setting the stack resource limit to 32m and run the
    same program, except without bothering to malloc() or touch anything.
    See if the stack segment is included in the reservation field.

    It would also be interesting to see how solaris deals with MAP_PRIVATE
    mmap's.

    If this is correct, then solaris is using a VMSPACE = SWAPSPACE
    model. FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

    -Matt
    Matthew Dillon
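The stack-limit experiment suggested above would normally be set up from the shell before starting the program (for example "limit stacksize 32m" in csh, or "ulimit -s 32768" in sh). A programmatic equivalent might look like the sketch below. It assumes POSIX getrlimit()/setrlimit(); the helper name set_stack_soft_limit is hypothetical, and whether a limit changed inside a running process affects the reservation before the next exec() is system dependent and not established in this thread.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Set the soft stack limit to `bytes`, clamped to the hard limit.
 * Returns 0 on success, -1 on failure.
 */
int set_stack_soft_limit(rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    /* An unprivileged process cannot raise the soft limit past the hard one. */
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

A test program would call set_stack_soft_limit((rlim_t)32 << 20), then sleep without allocating anything, so the reserved figure can be compared before and after.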
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
On Wed, 14 Jul 1999, John Nemeth wrote:

> On Jul 15, 2:40am, "Daniel C. Sobral" wrote:
> } Garance A Drosihn wrote:
> } > At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
> } > > In which case the program that consumed all memory will be killed.
> } > > The program killed is +NOT+ the one demanding memory, it's the one
> } > > with most of it.
> } >
> } > But that isn't always the best process to have killed off...
> }
> } Sure it is. :-) Let's see...
>
>      This statement is absurd. Only a competent admin can decide
> which process can be killed. No arbitrary decision is going to be
> correct.
>
> } > It would be nice to have a way to indicate that, a la SIGDANGER.

How about assigning something like a class to each process, which gives
the VM a hint as to which processes should be killed first without much
thinking, and which last (or never)? In other words, let's say class 10
means "totally disposable, kill whenever you want", and class 1 means
"never try to kill me". Of course, most processes would get some default
value, and the superuser could "renice" them to a more resistant class.

This way both sides of the discussion would be satisfied :-)

Andrzej Bialecki

// WebGiro AB, Sweden (http://www.webgiro.com)
// FreeBSD: The Power to Serve. http://www.freebsd.org
// Small & Embedded FreeBSD: http://www.freebsd.org/~picobsd/
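The class proposal above can be illustrated with a small selection routine. This is purely a sketch of the idea, not an existing kernel interface: struct proc_hint, kill_class and pick_victim are hypothetical names. The tie-break on memory size is an added assumption, borrowed from Daniel's remark that the process killed is the one with the most memory.

```c
#include <stddef.h>

/* Hypothetical per-process hint: 10 = totally disposable, 1 = never kill. */
struct proc_hint {
    int pid;
    int kill_class;
    unsigned long mem_kb;
};

/*
 * Pick the process to kill when memory runs out: the most disposable
 * class wins, and within a class the process holding the most memory.
 * Returns an index into procs[], or -1 if every process is class 1.
 */
int pick_victim(const struct proc_hint *procs, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (procs[i].kill_class <= 1)
            continue;               /* "never try to kill me" */
        if (best < 0 ||
            procs[i].kill_class > procs[best].kill_class ||
            (procs[i].kill_class == procs[best].kill_class &&
             procs[i].mem_kb > procs[best].mem_kb))
            best = (int)i;
    }
    return best;
}
```

Under this scheme a large class 1 process is never chosen, however much memory it holds, which is the behaviour the "competent admin" side of the argument is asking for.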
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
::-s    Print summary information about total swap
::      space usage and availability:
::
::      allocated   The total amount of swap space
::                  (in 1024-byte blocks)
::                  currently allocated for use as
::                  backing store.
::
::      reserved    The total amount of swap space
::                  (in 1024-byte blocks) not
::                  currently allocated, but
::                  claimed by memory mappings for
::                  possible future use.
::
::      used        The total amount of swap space
::                  (in 1024-byte blocks) that is
::                  either allocated or reserved.
:--
:soda

    It would be really easy to test this.

    Write a program that malloc's 32MB of space and touches it,
    then sleeps 10 seconds and forks, with both child and parent
    sleeping afterwards. ( the parent and the forked child should
    not touch the memory after the fork occurs ).

    Do a pstat -s before, after the initial touch, and after
    the fork. If you do not see the reserved swap space jump
    by 32MB after the fork, it isn't what you thought it was.

    -Matt
    Matthew Dillon
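The test described above might be written along these lines. It is a minimal sketch, not the program anyone in the thread actually ran: the function name reservation_test and its parameters are illustrative, it adds one extra pause between malloc() and the touch so the reservation can be observed at that point too, and it touches one byte per page through a volatile pointer so the compiler cannot drop the writes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * malloc `bytes`, dirty every page, pause, fork, then pause in both
 * parent and child without touching the memory again.
 * Run "pstat -s" (or "swap -s" on Solaris) during each pause.
 * Returns 0 on success, -1 on failure.
 */
int reservation_test(size_t bytes, unsigned pause_secs)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    char *buf = malloc(bytes);
    volatile char *p = buf;
    int status;
    pid_t pid;

    if (buf == NULL)
        return -1;
    if (pagesz <= 0)
        pagesz = 4096;              /* fallback guess if sysconf fails */
    fprintf(stderr, "malloced %zu bytes\n", bytes);
    sleep(pause_secs);

    for (size_t i = 0; i < bytes; i += (size_t)pagesz)
        p[i] = 1;                   /* dirty one byte per page */
    fprintf(stderr, "touched\n");
    sleep(pause_secs);

    fflush(NULL);                   /* avoid duplicated buffered output */
    pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {                 /* child: sleep, never touch buf */
        sleep(pause_secs);
        _exit(0);
    }
    fprintf(stderr, "forked\n");
    sleep(pause_secs);              /* parent: sleep, never touch buf */

    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    free(buf);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

For the experiment as described, a main() would call reservation_test(32u << 20, 10) and the swap summary would be sampled from another terminal during each 10-second pause.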
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:"pstat -s" on SunOS4, and "swap -s" on SunOS5. From Solaris man page:
:
::-s    Print summary information about total swap
::      space usage and availability:
::
::      allocated   The total amount of swap space
::                  (in 1024-byte blocks)
::                  currently allocated for use as
::                  backing store.
::
::      reserved    The total amount of swap space
::                  (in 1024-byte blocks) not
::                  currently allocated, but
::                  claimed by memory mappings for
::                  possible future use.
::
::      used        The total amount of swap space
::                  (in 1024-byte blocks) that is
::                  either allocated or reserved.
:--
:soda

    Yah, that's what I thought. A solaris expert could tell us for sure
    but I am pretty sure those are simply cached swap blocks after-the-fact,
    not actual reservations on potentially swappable space.

    -Matt
    Matthew Dillon
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
> On Thu, 15 Jul 1999 11:09:01 -0700 (PDT), Matthew Dillon said:

> Umm... how are you getting the reserved numbers?

"pstat -s" on SunOS4, and "swap -s" on SunOS5. From Solaris man page:

:-s    Print summary information about total swap
:      space usage and availability:
:
:      allocated   The total amount of swap space
:                  (in 1024-byte blocks)
:                  currently allocated for use as
:                  backing store.
:
:      reserved    The total amount of swap space
:                  (in 1024-byte blocks) not
:                  currently allocated, but
:                  claimed by memory mappings for
:                  possible future use.
:
:      used        The total amount of swap space
:                  (in 1024-byte blocks) that is
:                  either allocated or reserved.
--
soda
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:Both Dillon and Sobral mistakenly claimed that "Solaris overcommits",
:this fact seems to be somewhat suggestive.
:
:And also, the followings are allocated memory and reserved memory
:in my environment. (This table also includes Eduardo's example)
:
:	SunOS	allocated	reserved	total	total/allocated
:	-----	---------	--------	-----	---------------
:	4.1.4	    4268k	   1248k	   5516k	1.2924
:	4.1.2	    7732k	   1492k	   9224k	1.193
:	4.1.4	    8848k	   3080k	  11928k	1.3481
:	4.1.4	   13532k	   6772k	  20304k	1.5004
:	5.5.1	   15312k	   5092k	  20404k	1.3325
:	4.1.3	   16112k	   6512k	  22624k	1.4042
:	4.1.2	   26356k	   1620k	  27976k	1.0615
:	4.1.4	   26560k	   3756k	  30316k	1.1414
:	5.5	   26076k	  11348k	  37424k	1.4352
:	4.1.4	   32984k	   5556k	  38540k	1.1684
:	5.6	   32448k	   7072k	  39520k	1.2179
:	4.1.4	   38056k	   3692k	  41748k	1.097
:	4.1.4	   49064k	   7672k	  56736k	1.1564
:	4.1.4	   67012k	   7800k	  74812k	1.1164
:	4.1.4	   99348k	  16956k	 116304k	1.1707
:	4.1.4	  118288k	  11780k	 130068k	1.0996
:	5.6	  231968k	  18880k	 250848k	1.0814
:	5.7	  307240k	  19464k	 326704k	1.0634
:
:	(sorted by total amount of used swap)
:
:In those examples, non-overcommiting system requires 1.06x ... 1.50x
:...
:soda

    Umm... how are you getting the reserved numbers? Are you sure that isn't
    simply cached swap blocks? I.E. when something gets swapped out and then
    is swapped back in and dirtied, Solaris may be holding the swap block
    assignment rather than letting it go. FreeBSD-stable does the same thing.
    FreeBSD-current does not -- it lets it go in order to be able to
    reallocate it later as part of a contiguous swath for performance reasons.
    These 'extra' swap blocks are effectively reserved but not actually
    allocated. They can be reassigned.

    The numbers above are very similar to what you would see in a
    redirtying-cache swap block situation on a FreeBSD-stable system.

    If I add up all the unshared writeable segments on my home box - that is,
    all segments for which one would potentially have to reserve swap space -
    I get a total of around 382MB. The machine is currently eating around
    100MB of ram and 5MB of swap, or around a 3.5:1 ratio in this case.

    A non-overcommit model would have to reserve swap space for
    382MB - 100MB = 282MB versus the 5MB of swap the machine actually
    allocates.

    -Matt
    Matthew Dillon
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
> On Thu, 15 Jul 1999, Daniel C. Sobral wrote: >> Uh... like any modern unix, Solaris overcommits. > On Thu, 15 Jul 1999 08:46:36 -0700 (PDT), "Eduardo E. Horvath" said: > Where do you guys get this misinformation? : > Note the `19464k reserved'; that space has been reserved but not yet > allocated. Both Dillon and Sobral mistakenly claimed that "Solaris overcommits", this fact seems to be somewhat suggestive. And also, the followings are allocated memory and reserved memory in my environment. (This table also includes Eduardo's example) SunOS allocated reservedtotal total/allocated - - 4.1.4 4268k1248k5516k 1.2924 4.1.2 7732k1492k9224k 1.193 4.1.4 8848k3080k 11928k 1.3481 4.1.4 13532k6772k 20304k 1.5004 5.5.1 15312k5092k 20404k 1.3325 4.1.3 16112k6512k 22624k 1.4042 4.1.2 26356k1620k 27976k 1.0615 4.1.4 26560k3756k 30316k 1.1414 5.526076k 11348k 37424k 1.4352 4.1.4 32984k5556k 38540k 1.1684 5.632448k7072k 39520k 1.2179 4.1.4 38056k3692k 41748k 1.097 4.1.4 49064k7672k 56736k 1.1564 4.1.4 67012k7800k 74812k 1.1164 4.1.4 99348k 16956k 116304k 1.1707 4.1.4 118288k 11780k 130068k 1.0996 5.6 231968k 18880k 250848k 1.0814 5.7 307240k 19464k 326704k 1.0634 (sorted by total amount of used swap) In those examples, non-overcommiting system requires 1.06x ... 1.50x more swap space than overcommiting system. This table also indicates that in proportion as total used swap increase the ratio will decrease. And extra swap space required on non-overcommiting system is approximately several tens mega bytes. i.e. The extra cost of non-overcommiting system is less than ten dollers in my environment. Matt Dillon claimed that non-overcommiting system requires 8x or more swap space than overcommiting system. That's just wrong as above. (There might be cases which requires 8x swap, but it is not typical like Dillon said.) If you don't want non-overcommiting system, because you don't want to pay it's cost. That's OK, but please don't force us to accept your limited view. 
-- soda To Unsubscribe: send mail to majord...@freebsd.org with "unsubscribe freebsd-hackers" in the body of the message
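The total/allocated column in the table can be rechecked with a few lines of C. This is only a minimal sketch: the function name swap_ratio is made up for illustration, and the inputs are the kilobyte figures printed by "pstat -s" or "swap -s".

```c
/* Ratio of the swap a non-overcommitting system sets aside
 * (allocated + reserved) to the swap actually allocated.
 * Both arguments are in kilobytes. Returns 0.0 when nothing is
 * allocated, to avoid dividing by zero. */
static double swap_ratio(unsigned long allocated_k, unsigned long reserved_k)
{
    if (allocated_k == 0)
        return 0.0;
    return (double)(allocated_k + reserved_k) / (double)allocated_k;
}
```

For the SunOS 5.7 row, swap_ratio(307240, 19464) comes out at roughly 1.063, which matches the last column.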
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
::-s      Print summary information about total swap
::        space usage and availability:
::
::        allocated   The total amount of swap space
::                    (in 1024-byte blocks)
::                    currently allocated for use as
::                    backing store.
::
::        reserved    The total amount of swap space
::                    (in 1024-byte blocks) not
::                    currently allocated, but
::                    claimed by memory mappings for
::                    possible future use.
::
::        used        The total amount of swap space
::                    (in 1024-byte blocks) that is
::                    either allocated or reserved.
:--
:soda

    It would be really easy to test this. Write a program that mallocs
    32MB of space and touches it, then sleeps 10 seconds and forks, with
    both child and parent sleeping afterwards. (The parent and the forked
    child should not touch the memory after the fork occurs.)

    Do a pstat -s before, after the initial touch, and after the fork. If
    you do not see the reserved swap space jump by 32MB after the fork, it
    isn't what you thought it was.

					-Matt
					Matthew Dillon <[EMAIL PROTECTED]>
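The test described above could look roughly like the following. It is a sketch, not code from the thread: the function name run_fork_test and its parameters are invented here, and the pause length is a parameter so the swap summary can be sampled by hand during each sleep.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Allocate `bytes`, dirty every page, pause, fork, then pause again in both
 * processes without touching the memory. Returns 0 on success, -1 on error.
 * Sample "pstat -s" (SunOS 4) or "swap -s" (SunOS 5) before the call,
 * during the first pause, and during the second pause. */
static int run_fork_test(size_t bytes, unsigned pause_seconds)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                      /* fallback guess, an assumption */

    char *buf = malloc(bytes);
    if (buf == NULL)
        return -1;

    /* Write through a volatile pointer so the stores cannot be elided. */
    volatile char *p = buf;
    for (size_t off = 0; off < bytes; off += (size_t)page)
        p[off] = 1;

    sleep(pause_seconds);                 /* sample: after the initial touch */

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        sleep(pause_seconds);             /* child sleeps, never touches buf */
        _exit(0);
    }

    sleep(pause_seconds);                 /* sample: after the fork */

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(buf);
    if (waited < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling run_fork_test(32UL * 1024 * 1024, 10) from main() matches the 32MB and 10 second figures in the message.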
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:"pstat -s" on SunOS4, and "swap -s" on SunOS5. From Solaris man page:
:
::-s      Print summary information about total swap
::        space usage and availability:
::
::        allocated   The total amount of swap space
::                    (in 1024-byte blocks)
::                    currently allocated for use as
::                    backing store.
::
::        reserved    The total amount of swap space
::                    (in 1024-byte blocks) not
::                    currently allocated, but
::                    claimed by memory mappings for
::                    possible future use.
::
::        used        The total amount of swap space
::                    (in 1024-byte blocks) that is
::                    either allocated or reserved.
:--
:soda

    Yah, that's what I thought. A Solaris expert could tell us for sure,
    but I am pretty sure those are simply cached swap blocks after the
    fact, not actual reservations on potentially swappable space.

					-Matt
					Matthew Dillon <[EMAIL PROTECTED]>
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
> On Thu, 15 Jul 1999 11:09:01 -0700 (PDT), Matthew Dillon <[EMAIL PROTECTED]> said:

> Umm... how are you getting the reserved numbers?

"pstat -s" on SunOS4, and "swap -s" on SunOS5. From the Solaris man page:

:-s      Print summary information about total swap
:        space usage and availability:
:
:        allocated   The total amount of swap space
:                    (in 1024-byte blocks)
:                    currently allocated for use as
:                    backing store.
:
:        reserved    The total amount of swap space
:                    (in 1024-byte blocks) not
:                    currently allocated, but
:                    claimed by memory mappings for
:                    possible future use.
:
:        used        The total amount of swap space
:                    (in 1024-byte blocks) that is
:                    either allocated or reserved.
--
soda
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:Both Dillon and Sobral mistakenly claimed that "Solaris overcommits",
:this fact seems to be somewhat suggestive.
:
:And also, the followings are allocated memory and reserved memory
:in my environment. (This table also includes Eduardo's example)
:
: SunOS   allocated   reserved      total   total/allocated
: -----   ---------   --------   --------   ---------------
: 4.1.4       4268k      1248k      5516k   1.2924
: 4.1.2       7732k      1492k      9224k   1.193
: 4.1.4       8848k      3080k     11928k   1.3481
: 4.1.4      13532k      6772k     20304k   1.5004
: 5.5.1      15312k      5092k     20404k   1.3325
: 4.1.3      16112k      6512k     22624k   1.4042
: 4.1.2      26356k      1620k     27976k   1.0615
: 4.1.4      26560k      3756k     30316k   1.1414
: 5.5        26076k     11348k     37424k   1.4352
: 4.1.4      32984k      5556k     38540k   1.1684
: 5.6        32448k      7072k     39520k   1.2179
: 4.1.4      38056k      3692k     41748k   1.097
: 4.1.4      49064k      7672k     56736k   1.1564
: 4.1.4      67012k      7800k     74812k   1.1164
: 4.1.4      99348k     16956k    116304k   1.1707
: 4.1.4     118288k     11780k    130068k   1.0996
: 5.6       231968k     18880k    250848k   1.0814
: 5.7       307240k     19464k    326704k   1.0634
:
:	(sorted by total amount of used swap)
:
:In those examples, non-overcommiting system requires 1.06x ... 1.50x
:...
:soda

    Umm... how are you getting the reserved numbers? Are you sure that
    isn't simply cached swap blocks? I.e. when something gets swapped out
    and then is swapped back in and dirtied, Solaris may be holding the
    swap block assignment rather than letting it go. FreeBSD-stable does
    the same thing. FreeBSD-current does not -- it lets it go in order to
    be able to reallocate it later as part of a contiguous swath for
    performance reasons.

    These 'extra' swap blocks are effectively reserved but not actually
    allocated. They can be reassigned. The numbers above are very similar
    to what you would see in a redirtying-cache swap block situation on a
    FreeBSD-stable system.

    If I add up all the unshared writeable segments on my home box - that
    is, all segments for which one would potentially have to reserve swap
    space - I get a total of around 382MB. The machine is currently
    eating around 100MB of ram and 5MB of swap, or around a 3.5:1 ratio
    in this case.

    A non-overcommit model would have to reserve swap space for
    382MB - 100MB = 282MB versus the 5MB of swap the machine actually
    allocates.

					-Matt
					Matthew Dillon <[EMAIL PROTECTED]>
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
At 6:29 PM -0700 7/14/99, Matthew Dillon wrote:
>If 1G isn't enough, spend another $30 and throw 2G of swap
>online. Or perhaps dedicate an entire $150 disk and throw
>6+ GB of swap online.
>
>The equivalent setup using a non-overcommit model would require
>considerably more swap to have the same reliability.

Please note that we're talking at cross-purposes here, mainly
because I didn't realize this same general topic was being beaten
to death in the 'replacement for grep' thread (which I have not
been following).

Speaking for just me, myself and I, I have no problems with the
current overcommit model. All I'd like to do is have a way to
indicate which processes should not get booted first, if the
system does indeed run out of swap and needs to boot some processes.

However, other people seem much more worked up about this topic
than I am, and thus what I (personally) meant as "just casual
questions" seem to be taken as "demands that something be done,
RIGHT NOW". I now realize that some people are arguing that
malloc should return an error if the system runs out of space,
but that's not what I am thinking about.

So, I think I'll bow out of this discussion for now, and maybe try
to discuss my "casual questions" sometime in a different context...

---
Garance Alistair Drosehn           =   g...@eclipse.acs.rpi.edu
Senior Systems Programmer          or  dro...@rpi.edu
Rensselaer Polytechnic Institute
Re: Replacement for grep(1) (part 2)
Kevin Schoedel wrote:
>
> >Imagine a reasonably big
> >program, like Netscape or Emacs, of which you usually just use a
> >subset of features. There can easily be many megabytes of code and
> >data in them you never actually use, or you don't _usually_ use
> >(like the people who use emacs like it was vi :). Without
> >overcommit, you need to allocate all that memory for the code, no
> >matter whether you end up using it or not. With overcommit, there is
> >no such problem.
>
> Code, static data, and not-yet-written writable data should be backed by
> the executable file, not by swap space, so unused code and tables should
> not be a problem.

TEXT should be backed by the executable, as long as the program
doesn't change it to read/write. That's not the code I was referring
to.

Not-yet-written blah-blah-blah should be backed by:

1) The executable file if you are overcommitting.
2) RAM/Swap if you are not.

If you don't do this, you are overcommitting. Proof: let the system
exhaust its memory. Change a single byte in the not-yet-written
stuff. Now you need more memory than you have to comply with a
regular operation (like changing the value of a global variable),
which means you overcommitted.

Now come the people saying "don't overcommit in *this* case, and
overcommit in *that* case". Irrelevant. Programs are still getting
killed because memory was overcommitted (with the added disadvantage
of you not having as much memory as in a full overcommit mode).

> Stack is more interesting. There might be a place for a global overcommit
> switch. I think I'd be happier with a scheme in which stack the first
> page or first few pages are committed (so that reasonable programs will
> never run into trouble) and remaining stack is over-/un-committed by
> default, along with means for unusual programs to commit (and/or test
> commitability of) subsequent pages.

Eh? Reasonable programs *never* run into trouble. Trouble only
happens when you have unreasonable programs around, or did not
configure the system correctly. And if you did not configure the
system correctly, why do you think you would be able to correctly
estimate the stack needed for the various programs?

--
Daniel C. Sobral			(8-DCS)
d...@newsguy.com
d...@freebsd.org

	"Would you like to go out with me?"
	"I'd love to."
	"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Hi everyone,

I've been following this discussion almost from the beginning, and I
have the feeling that we're not _really_ getting very far. There are
good arguments for and against overcommit, depending on your point of
view and your requirements.

What I do see is a not-so-openly voiced consensus that the way
resource(sp?) shortages are handled in an overcommitting system
(SIGKILL) makes some of us rather unhappy.

I therefore suggest those of us who would like to see a change in this
area pool their efforts and energies to work on a mechanism that
handles resource shortage in a more graceful way.

cheerio
Michael
--
michael.schus...@germany.sun.com
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
On Thu, Jul 15, 1999 at 01:48:40PM +0900, Daniel C. Sobral wrote:
> >
> > If you have a lot of users, all of which have buggy programs which eat
> > a lot of memory, per-user swap quotas don't necessarily save your butt.
>
> The chance of these buggy programs running at the same time is not
> exactly high...

Well, it is higher than you're probably giving credit for.

Suppose Professor A. hands out assignment X. Unfortunately, some piece
of code he supplied to his, let's say, 200 ignorant first-year students
has this particular memory-eating bug.

Being ignorant first-year students, they will notice something is
wrong, assume the problem is their fault, and repeat the exact same
procedure five or so times. Again, being ignorant first-year students,
they will probably all be using the same shell server.

To make things worse, some wise-ass may have told a bunch of them how
to use ulimit or limit in order to push their available resources as
high as possible (perhaps very high, since the admin hopefully
recognizes that sometimes students need high resource limits to
perform research).

Fortunately, overcommit rescues the machine and kills those buggy
programs instead of letting them spin around forever in some kind of
"malloc() failed ... must be temporary failure, wait and retry".

--
This is my .signature which gets appended to the end of my messages.
Re: Replacement for grep(1) (part 2)
>
> And what do you do, then, with the processes that happen to have
> legitimate use for more stack?
>
> Or maybe you just find out how much stack each process uses, and
> then set limits appropriate for each one? Which is the equivalent of
> setting limits to each user, of course...

You get a little program, like e.g. Xenix and Minix had, which lets you
modify the executable header to indicate how much stack the system
should reserve.

If the program decides to use more stack for some reason, then it dies;
this is in effect "stack overcommit". 8)

--
\\  The mind's the standard       \\  Mike Smith
\\  of the man.                   \\  msm...@freebsd.org
\\    -- Joseph Merrick           \\  msm...@cdrom.com
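The header-patching tool mentioned above is not shown in the thread. The closest standard per-process mechanism is the POSIX stack resource limit, so the sketch below uses that instead; it is a substitute technique, not the Xenix or Minix tool. The function name cap_stack is invented for illustration. A process that grows its stack past the limit is normally killed with SIGSEGV, which is the "then it dies" behaviour described.

```c
#include <sys/resource.h>

/* Set the soft stack limit of the calling process to `bytes`, clamped to
 * the hard limit. Returns 0 on success, -1 on error. A launcher would
 * normally call this between fork() and exec() so that the limit applies
 * to the program being started. */
static int cap_stack(rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;              /* cannot exceed the hard limit */
    rl.rlim_cur = bytes;                  /* hard limit is left unchanged */
    return setrlimit(RLIMIT_STACK, &rl);
}
```

Unlike a value stored in the executable header, this limit has to be applied by whoever starts the program, which is essentially the per-process bookkeeping the quoted question complains about.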
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
lyn...@orthanc.ab.ca wrote:
>
> What it so evil about having a reasonably intelligent malloc() that
> tells the truth, and returns unused memory to the system? Overcommit
> is for lazy programmers, plain and simple. At least the SGI documentation
> about overcommit admits that (or at least, did at one time).

Yes. So are high-level languages, as a matter of fact. True
memory-conscious programmers will never use anything besides
assembler.

--
Daniel C. Sobral			(8-DCS)
d...@newsguy.com
d...@freebsd.org

	"Would you like to go out with me?"
	"I'd love to."
	"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
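For readers who want to see what the "malloc() that tells the truth" side of the argument has in mind, here is a minimal sketch of a request handler that turns an allocation failure into a refusal instead of a crash. Everything named here (imap_result, run_search, failing_alloc, the allocator parameter) is hypothetical and exists only so the failure path can be exercised. On an overcommitting system malloc() may well succeed and the process be killed later when the pages are touched, which is the point of contention in this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical outcome of one client command. */
enum imap_result { IMAP_OK, IMAP_NO };

/* Allocator signature, so a failing allocator can be substituted. */
typedef void *(*alloc_fn)(size_t);

/* Stand-in allocator that always reports exhaustion. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/* Hypothetical SEARCH step needing `bytes` of scratch space. If the
 * allocation is refused, only this command fails; the caller keeps the
 * connection open and reports NO to the client. */
static enum imap_result run_search(size_t bytes, alloc_fn alloc)
{
    char *scratch = alloc(bytes);

    if (scratch == NULL)
        return IMAP_NO;                   /* graceful refusal */
    memset(scratch, 0, bytes);            /* placeholder for the real work */
    free(scratch);
    return IMAP_OK;
}
```

With the real allocator, run_search(4096, malloc) returns IMAP_OK; with failing_alloc it returns IMAP_NO without terminating the process.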
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Jason Thorpe wrote:
>
> If you have a lot of users, all of which have buggy programs which eat
> a lot of memory, per-user swap quotas don't necessarily save your butt.

The chance of these buggy programs running at the same time is not
exactly high...

> And maybe the individual programs didn't encounter their resource limits.
>
> ...but the sheer number of these runaway things caused the overcommit to
> be a problem. If malloc() or whatever had actually returned NULL at the
> right time (i.e. as backing store was about to become overcommitted), then
> these runaway processes would have stopped running away (they would have
> gotten a SIGSEGV and died).
>
> Anyhow, my "lame undergrads" example comes from a time when PCs weren't
> really powerful enough for the job (or something; anyhow, we didn't have
> any in the department :-). My example is from a Sequent Balance (16
> ns32032 processors, 64M RAM [I think; been a while], 4.2BSD variant).

So, tell me... when NetBSD gets its non-overcommit switch, would you
use it in the environment you describe?

--
Daniel C. Sobral			(8-DCS)
d...@newsguy.com
d...@freebsd.org

	"Would you like to go out with me?"
	"I'd love to."
	"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
John Nemeth wrote:
>
> The machine in question has run out of swap, due to unforseeable
> excessive memory demands. This was accompanied by processes
> complaining about not being able to allocate memory and then cleaning
> up after themselves. I did not see random processes being killed
> because of it. That is the way things should be. From this, I can
> assume that the OS doesn't overcommit. In case, you're wondering, the
> OS in question is:
>
> SunOS 5.5 Generic_103093-25 sun4u sparc

Uh... like any modern unix, Solaris overcommits.

--
Daniel C. Sobral			(8-DCS)
d...@newsguy.com
d...@freebsd.org

	"Would you like to go out with me?"
	"I'd love to."
	"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Michael Richardson wrote:
>
> Ben> Tell me, Mr. Nemeth, has this ever happened to you? Have you ever
> Ben> come *close*?
>
> Uh, since we don't run overcommit, the answer is specifically *NO*.

And what system do you run?

> I have had it happen on other systems. (Solaris, AIX) It was very
> mystifying to diagnose. Sure, the systems were misconfigured for what we
> were trying to do, but if I wanted build a custom system for every
> application well... I'd be running NT.

I have to agree about the mystifying diagnosis... Especially when they
*don't* page like hell.

--
Daniel C. Sobral			(8-DCS)
d...@newsguy.com
d...@freebsd.org

	"Would you like to go out with me?"
	"I'd love to."
	"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Michael Richardson wrote:
>
> No, I don't agree.
>
> This is a biggest argument against solving the overcommit situation with
> SIGKILL. I have no problem with overcommit as a concept, I have a problem
> with being unable to keep my possibly big processes (X, rpc.nisd,
> etc. depending on cicumstances) from being victims.

It is no more difficult to protect big processes than it is to
create user limits.

--
Daniel C. Sobral			(8-DCS)
d...@newsguy.com
d...@freebsd.org

	"Would you like to go out with me?"
	"I'd love to."
	"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
John Nemeth wrote:
>
> On one system I administrate, the largest process is typically
> rpc.nisd (the NIS+ server daemon). Killing that process would be a
> bad thing (TM). You're talking about killing random processes. This
> is no way to run a system. It is not possible for any arbitrary
> decision to always hit the correct process. That is a decision that
> must be made by a competent admin. This is the biggest argument
> against overcommit: there is no way to gracefully recover from an
> out of memory situation, and that makes for an unreliable system.

If you run out of memory, it is either a misconfigured system, or a
runaway program. If a program is runaway, then:

1) It is larger than your typical rpc.nisd.
2) You cannot tell the system a priori to kill it, because you don't
   know about it (or else you wouldn't be running it in the first
   place).

A system running in overcommit assumes that you have it correctly
configured so it will *not* run out of memory under normal
conditions. This happens to be the same assumption Unix makes.

--
Daniel C. Sobral			(8-DCS)
d...@newsguy.com
d...@freebsd.org

	"Would you like to go out with me?"
	"I'd love to."
	"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:
:On Jul 15, 12:20am, "Daniel C. Sobral" wrote:
:} "Charles M. Hannum" wrote:
:} >
:} > That's also objectively false. Most such environments I've had
:} > experience with are, in fact, multi-user systems. As you've pointed
:} > out yourself, there is no combination of resource limits and whatnot
:} > that are guaranteed to prevent `crashing' a multi-user system due to
:} > overcommit. My simulation should not be axed because of a bug in
:} > someone else's program. (This is also not hypothetical. There was a
:} > bug in one version of bash that caused it to consume all the memory it
:} > could and then fall over.)
:}
:} In which case the program that consumed all memory will be killed.
:} The program killed is +NOT+ the one demanding memory, it's the one
:} with most of it.
:
:     On one system I administrate, the largest process is typically
:rpc.nisd (the NIS+ server daemon). Killing that process would be a
:bad thing (TM). You're talking about killing random processes. This
:is no way to run a system. It is not possible for any arbitrary
:decision to always hit the correct process. That is a decision that
:must be made by a competent admin. This is the biggest argument
:against overcommit: there is no way to gracefully recover from an
:out of memory situation, and that makes for an unreliable system.
:
:}-- End of excerpt from "Daniel C. Sobral"

    ... and the chance of that system running out of swap space is?

    The machine has hit the wall, the admin can't log in. What is the
    kernel to do?

					-Matt
					Matthew Dillon
Re: Replacement for grep(1) (part 2)
:> I mean, jeeze, the reservation for the program stack alone would eat
:> up all your available swap space! What is a reasonable stack size? The
:> system defaults to 8MB. Do we rewrite every program to specify its own
:> stack size? How do we account for architectural differences?
:
:The alternative is to rewrite every program that assumes the semantics
:of malloc() are being followed. The problem I have as an applications
:writer is that I tend to believe malloc. To pick a specific example,
:our IMAP client takes steps to ensure it won't run out of memory in
:critical sections. We maintain a "rainy day" pool block of memory. If
:...
:--lyndon

    We just put a cap on the number of imap clients we allow running at
    any given moment... say, a few hundred. Not only does it work just
    dandy, it also prevents the machine from overloading and gives us a
    nice "you may want to look into this" alarm.

    We do the same thing with sendmail, popper, the web server, named,
    and every other service which can be forked. This also prevents one
    subsystem from overly interfering with another. For example, if
    popper saturates it does not overly interfere with imapd operation.

    The limit is set to around 3x the Monday peak load, and sufficient
    swap is configured to deal with it should the limit be hit. Problem
    solved. No fancy modifications required.

    If any of these subsystems actually ever got close to using all
    available swap, the other subsystems would be up the creek anyway so,
    really, it doesn't make much sense hacking the source to allow the
    subsystem to run into the wall anyway.

					-Matt
					Matthew Dillon
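The per-service cap described above amounts to a counter checked before every fork. The following is a rough sketch of that bookkeeping only; the names svc_limit, svc_cap_from_peak, svc_admit and svc_release are invented here, and nothing in it is taken from inetd, sendmail or any real daemon.

```c
/* Hypothetical admission counter for one forking service. */
struct svc_limit {
    int max;      /* most children allowed at once */
    int running;  /* children currently alive */
};

/* Cap derived from an observed peak, using the "around 3x the Monday
 * peak load" rule from the message. */
static int svc_cap_from_peak(int peak_children)
{
    return 3 * peak_children;
}

/* Returns 1 and counts the new child if one may be forked, 0 if the
 * service is saturated and the connection should be refused or delayed. */
static int svc_admit(struct svc_limit *s)
{
    if (s->running >= s->max)
        return 0;
    s->running++;
    return 1;
}

/* Call when a child has been reaped, e.g. after waitpid() returns. */
static void svc_release(struct svc_limit *s)
{
    if (s->running > 0)
        s->running--;
}
```

Keeping one such counter per service is what stops a saturated popper from starving imapd: each service hits its own ceiling independently.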
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
:Our IMAP server routinely show a footprint of about 1MB private storage. :This is constant for most operations. However, when you get into doing :SEARCH and SORT, there are certain cases where we need memory, sometimes :a *lot* of memory. : :Your proposal is that my *well behaved* application should be arbitrarily :killed, leaving the client stuck with a) no results and b) no IMAP :connection, in this situation. (And think threaded. That one server :could be handling *hundreds* of clients.) This is preferable to :returning a NULL to the malloc() request, which I can handle :gracefully by simply returning a NO response to the IMAP client? : :What it so evil about having a reasonably intelligent malloc() that :tells the truth, and returns unused memory to the system? Overcommit :is for lazy programmers, plain and simple. At least the SGI documentation :about overcommit admits that (or at least, did at one time). : :--lyndon If you are running an IMAP server that regularly runs out of swap space, you have a configuration problem which needs to be addressed. It's as simple as that. What you are putting forth is an example of something that will never happen on a properly configured server. In regards to the general case where one is running third-party applications. Here you are assuming that you can go in and modify every single piece of software running on the machine to deal with malloc() returning NULL. Because if you don't, the machine isn't going to be very stable. Not only that, you are assuming that you will make the correct decision on what action to take when malloc() *does* return NULL. If you decide to return an error code but not exit, what happens when a potential blowup situation results in thousands of imap processes being run on the system, and NONE of them exit when their malloc() fails? The problem is a whole lot more complex then simply having the OS return NULL from a malloc(). Currently the OS kills processes as a last resort. 
The idea is that no nominally running system runs out of swap. Now you propose to take away the kernel's ability to recover some memory as a last resort and instead put it into the hands of the very user or root-run processes that are causing the problem in the first place!

A much better solution would be to write a simple watchdog script that notices when swap space is low and does the right thing -- e.g. kills the non-essential processes and leaves the essential ones alone. Then the kernel never actually reaches a state of last-resort.

-Matt
Matthew Dillon
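For illustration only, here is a rough sketch of what one pass of such a watchdog might look like. Nothing in it comes from the message above: the `Proc` record, the `essential` flag, the 10% low-water mark, and the injected `read_free_fraction` and `list_procs` callables are all assumptions made for the example. How free swap is actually measured is system-specific and is deliberately left to the caller.

```python
import os
import signal
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Proc:
    pid: int
    name: str
    essential: bool  # hypothetical flag; a real policy might key on name or uid


def pick_victims(free_fraction: float, low_water: float,
                 procs: Iterable[Proc]) -> List[int]:
    """Return PIDs of non-essential processes, or [] if swap is not low."""
    if free_fraction >= low_water:
        return []
    return [p.pid for p in procs if not p.essential]


def watchdog_pass(read_free_fraction: Callable[[], float],
                  list_procs: Callable[[], Iterable[Proc]],
                  low_water: float = 0.10,
                  kill: Callable[[int, int], None] = os.kill) -> List[int]:
    """One polling pass: measure free swap, signal non-essential processes if low.

    read_free_fraction and list_procs are supplied by the caller (assumed
    helpers); kill defaults to os.kill and can be replaced for a dry run.
    """
    victims = pick_victims(read_free_fraction(), low_water, list_procs())
    for pid in victims:
        kill(pid, signal.SIGTERM)  # polite request first; escalation is not shown
    return victims
```

A real script would call `watchdog_pass` from a timed loop. Passing a logging function as `kill` gives a dry run that reports what would be signalled without touching any process.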
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Sergey Babkin wrote:
>
> > It would be nice to have a way to indicate that, a la SIGDANGER.
>
> Another option may be to add something like "importance classes".
> Suppose we assign a one-byte "importance level" to each process.
> When we run out of swap we start killing processes with the lowest
> importance level. This seems to be both easy to implement and
> a rather robust solution.

Setting limits is just as easy to do, and has the added benefit of not having any process killed.

--
Daniel C. Sobral (8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
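To make the quoted "importance classes" proposal concrete, here is a small sketch of how victims might be chosen under it. This is only one reading of the idea: the `RankedProc` record, the `swap_kb` field, the `need_kb` target, and the tie-breaking rule are assumptions for the example, not anything specified in the thread.

```python
from typing import List, NamedTuple


class RankedProc(NamedTuple):
    pid: int
    importance: int  # 0..255 (one byte); lower values are killed first
    swap_kb: int     # swap the process is assumed to release when killed


def choose_by_importance(procs: List[RankedProc], need_kb: int) -> List[int]:
    """Pick PIDs in ascending importance until need_kb of swap is covered."""
    victims: List[int] = []
    freed = 0
    # Assumption: ties on importance go to the largest swap user first.
    for p in sorted(procs, key=lambda p: (p.importance, -p.swap_kb)):
        if freed >= need_kb:
            break
        victims.append(p.pid)
        freed += p.swap_kb
    return victims
```

Unlike a per-process limit, this policy still kills something. It only decides the order, which is the trade-off the reply above points at.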
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
Jason Thorpe wrote:
>
> If you have a lot of users, all of which have buggy programs which eat
> a lot of memory, per-user swap quotas don't necessarily save your butt.

The chance of these buggy programs running at the same time is not exactly high...

> And maybe the individual programs didn't encounter their resource limits.
>
> ...but the sheer number of these runaway things caused the overcommit to
> be a problem. If malloc() or whatever had actually returned NULL at the
> right time (i.e. as backing store was about to become overcommitted), then
> these runaway processes would have stopped running away (they would have
> gotten a SIGSEGV and died).
>
> Anyhow, my "lame undergrads" example comes from a time when PCs weren't
> really powerful enough for the job (or something; anyhow, we didn't have
> any in the department :-). My example is from a Sequent Balance (16
> ns32032 processors, 64M RAM [I think; been a while], 4.2BSD variant).

So, tell me... when NetBSD gets its non-overcommit switch, would you use it in the environment you describe?

--
Daniel C. Sobral (8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do next?"
Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))
[EMAIL PROTECTED] wrote:
>
> What is so evil about having a reasonably intelligent malloc() that
> tells the truth, and returns unused memory to the system? Overcommit
> is for lazy programmers, plain and simple. At least the SGI documentation
> about overcommit admits that (or at least, did at one time).

Yes. So are high-level languages, as a matter of fact. True memory-conscious programmers will never use anything besides assembler.

--
Daniel C. Sobral (8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do next?"