Re: Replacement for grep(1) (part 2)

1999-07-17 Thread Matthew Dillon

:
:It sometimes results in running out of swap, too.
:
:> Inetd is rate-limited by default nowadays, so this really doesn't apply.
:
:It really does apply. Inetd limits incoming connections per minute, not per
:second. It is possible to use up the per-minute limit in a few seconds and
:cause a high load. Sendmail is worse than inetd; it cannot limit incoming rate on
:
:Netch

You can specify a maximum fork limit for inetd on a per-service basis.

You are a year or two too late on these things.  A great many improvements
have been made to programs like sendmail and inetd explicitly to deal 
with overload situations.  Web servers too.  These were fairly simple
changes as well.  For sendmail it was as simple as making MaxDaemonChildren
apply to queue runs - I submitted that one to Eric Allman two years ago
and it's been a part of sendmail since then.  For inetd it is the -c, -C,
and -R options (which can be specified on a per-service basis as well).
Dima and I added the -R option back in 1997 specifically to help with
DoS attacks.
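
A per-service cap on simultaneously running children, which is my reading of what the fork limit above amounts to, can be sketched as follows. This is only an illustration in Python, not inetd's code; the class and method names are hypothetical.

```python
class ChildLimiter:
    """Toy model: cap the number of simultaneously running children per service."""

    def __init__(self, default_max):
        self.default_max = default_max   # limit used when a service has no override
        self.per_service_max = {}        # service name -> its own limit
        self.active = {}                 # service name -> children currently running

    def set_limit(self, service, maximum):
        """Override the default cap for one service."""
        self.per_service_max[service] = maximum

    def try_start(self, service):
        """Return True and count a new child if the service is under its cap."""
        limit = self.per_service_max.get(service, self.default_max)
        running = self.active.get(service, 0)
        if running >= limit:
            return False                 # refuse: the fork would exceed the cap
        self.active[service] = running + 1
        return True

    def finished(self, service):
        """Record that one child of the service has exited."""
        self.active[service] = max(0, self.active.get(service, 0) - 1)
```

The point of a cap like this is that the worst-case process count is fixed by configuration, not by how fast connections arrive.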

Sendmail is not an issue when properly configured.

-Matt
Matthew Dillon 



To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Replacement for grep(1) (part 2)

1999-07-17 Thread Matthew Dillon
:> machine.  In this case the overcommit that can occur is with I/O, not
:> swap.  As a general performance rule, you have to set MaxDaemonChildren
:> and MaxArticleSize to prevent the overcommit from occurring.  This is a
:> function of sendmail, not a function of the kernel.
:
:Sigh. ((c) you) Sendmail can overcommit a machine even with the right set of
:MaxDaemonChildren, MaxArticleSize, QueueLA & RefuseLA options - I have seen
:such situations. MaxDaemonChildren limits only the number of main processes
:for incoming connections (plus queue run processes). For each connection,
:after "mail from:" and until the message is accepted, the server process for
:the incoming connection forks a child which accepts the recipient list and
:letter body. After accepting the message, that child can fork a delivery
:process. A queue run process with the "O ForkEachJob=true" option, which is
:the default, can create a delivery process for each queue job (in my
:practice, a queue of more than 1000 jobs is
:--
:Netch

Actually this isn't true.  QueueLA & RefuseLA tend to be useless options
with sendmail.  MaxDaemonChildren, on the other hand, tends to be a
very useful option.

By running the daemon and the queue separately, and putting the daemon
in queue-only mode, sendmail has virtually no chance of taking down the
machine.  Example (assuming a box w/256MB of ram):

sendmail -bd -O MaxDaemonChildren=130 -O DeliveryMode=queue
sendmail -q1m -O MaxDaemonChildren=40 -O MinQueueAge=30m
sendmail -q1m -O MaxDaemonChildren=40 -O MinQueueAge=2h

This is what we do at BEST.  Once we began doing things this way, 
our three (continuously loaded) frontend mail machines never bogged down
ever again. 

-Matt
Matthew Dillon 









Re: Replacement for grep(1) (part 2)

1999-07-17 Thread Valentin Nechayev
Matthew Dillon wrote:

> Give me a shell and I can crash any machine.

Oh. ;|

> A good example of this is sendmail.  Before the MaxDaemonChildren and
> MaxArticleSize options, it was possible for sendmail to overcommit a
> machine.  In this case the overcommit that can occur is with I/O, not
> swap.  As a general performance rule, you have to set MaxDaemonChildren
> and MaxArticleSize to prevent the overcommit from occurring.  This is a
> function of sendmail, not a function of the kernel.

Sigh. ((c) you) Sendmail can overcommit a machine even with the right set of
MaxDaemonChildren, MaxArticleSize, QueueLA & RefuseLA options - I have seen
such situations. MaxDaemonChildren limits only the number of main processes
for incoming connections (plus queue run processes). For each connection,
after "mail from:" and until the message is accepted, the server process for
the incoming connection forks a child which accepts the recipient list and
letter body. After accepting the message, that child can fork a delivery
process. A queue run process with the "O ForkEachJob=true" option, which is
the default, can create a delivery process for each queue job (in my
practice, a queue of more than 1000 jobs is an ordinary event). All these
forks depend on only one test - get the current LA and compare it with
QueueLA - which fails when the high load appeared less than one minute ago.
To prevent this overcommit (I go into the details in a parallel message),
the minimal (and possibly insufficient) set of settings is:
1) patch - insert sm_sleep(1) into the server subprocess code before the
"accepted" reply, to limit the incoming mail rate;
2) Decrease QueueLA for the listening daemon to a sub-minimal value
(e.g. 2);
3) Increase QueueLA for the queue running daemons to high values (e.g. 50)
and set "O ForkEachJob=false" for them.
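
Why a load-average test reacts too late to a sudden burst can be shown with a small model. This sketch assumes the traditional exponentially damped one-minute average, sampled every five seconds; the constants and function names are assumptions made for illustration and are not taken from sendmail or from any particular kernel.

```python
import math

SAMPLE_INTERVAL = 5.0    # seconds between samples (assumed)
WINDOW = 60.0            # averaging period of the one-minute load average
DECAY = math.exp(-SAMPLE_INTERVAL / WINDOW)

def next_load(load, runnable):
    """One sampling step of an exponentially damped average."""
    return load * DECAY + runnable * (1.0 - DECAY)

def load_after_burst(runnable, seconds, start=0.0):
    """Load average after `runnable` processes have been runnable for `seconds`."""
    load = start
    for _ in range(int(seconds / SAMPLE_INTERVAL)):
        load = next_load(load, runnable)
    return load
```

In this model, five seconds after 200 processes become runnable on an idle machine the reported load is only about 16, so a check against a high QueueLA still passes while the forks pile up.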

But most of these tunings are indirect. A direct tuning, found
experimentally on my mail servers, is a specially hacked pstat program that
returns 1 if either swap or file descriptors are more than 2/3 used, and 0
otherwise; on getting 1, sendmail stops delivering. Unfortunately, this
check is not portable.
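
The decision rule of that check is simple enough to sketch. The hacked pstat itself is not shown in this message, so the following only models the 2/3 threshold; how the usage figures are obtained is exactly the unportable part and is left out. All names are hypothetical.

```python
from fractions import Fraction

THRESHOLD = Fraction(2, 3)   # "more than 2/3 used"

def over(used, total):
    """True if strictly more than 2/3 of the resource is in use."""
    return total > 0 and Fraction(used, total) > THRESHOLD

def should_stop_delivery(swap_used, swap_total, fds_used, fds_total):
    """Return 1 if either swap or file descriptors are over the threshold, else 0."""
    return 1 if over(swap_used, swap_total) or over(fds_used, fds_total) else 0
```

Exact fractions are used so that a resource sitting at exactly 2/3 is not misreported because of floating-point rounding.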

(P.S. Don't tell me to change the MTA; that is another question entirely.)

> Another good example is a web server.  A web server must have specific
> limitations on the number of simultaneous connections it is allowed
> to handle at once and on the number of CGI's or other auxiliary programs
> that are allowed to be running at any given time.  The overcommit issue
> here has nothing to do with swap and everything to do with performance.
> Specifically, these limitations exist to avoid cascade failures.

As in the sendmail case, you propose making some calculations (which are
difficult and non-trivial for newbies) to estimate the necessary resources.
Another way, which is imho more acceptable, is to provide not hard barriers
(SIGKILL on overcommit), but soft barriers (e.g., stop allocating memory
for non-wheel users when memory begins to run out). An extra 64M of memory
or a disk for swap is usually much cheaper than the lost profit from a
critical service crash.

> In the same manner any truly critical system server must handle the
> resource management itself to deal with all sorts of problem situations,
> including memory.  You do not need to build any of this control into the
> kernel.

No, we need it. Not every server can be patched for such tests (because the
sources are lost or for some other reason), and not every admin can make the
necessary patches. The kernel must help with this.

--
Netch










Re: Replacement for grep(1) (part 2)

1999-07-17 Thread Valentin Nechayev
Brian F. Feldman wrote:

>> There are other ways.  For example, even if a user account is resource
>> limited, root processes (such as sendmail, popper, identd, and so forth)
>> are not.  Attacks against these servers generally result in very high
>> loads and sometimes make it difficult to login to fix the problem, but do
>> not result in running out of swap.

It sometimes results in running out of swap, too.

> Inetd is rate-limited by default nowadays, so this really doesn't apply.

It really does apply. Inetd limits incoming connections per minute, not per
second. It is possible to use up the per-minute limit in a few seconds and
cause a high load. Sendmail is worse than inetd; it cannot limit the incoming
rate on an established connection. Butenko's (bute...@stalker.com) DoS attack
on sendmail is to send thousands of letters to a local user over a fast
network connection (e.g., Ethernet) through one established TCP connection;
the only barrier is the LA test before the '250 XXX message accepted for
delivery' reply and the fork-and-deliver-or-queue-and-exit decision, but the
attacker can send too many letters in a few seconds, leaving hundreds of
delivery processes (/usr/libexec/mail.local) waiting on the locked mailbox.
LA reflects the system state over the last minute and thus is similar to the
average patient temperature across a hospital over the last year. ;( I have
seen a variant of this attack on my mail hosts, when a host with 6000 letters
in its mail queue (a mail2news server) sent all its mail to the smarthost (a
uucp spool server); after ~500 letters, sendmail on the smarthost closed port
25 on RefuseLA; it was saved from running out of swap only because domain
resolution took some time. The only mechanism against this type of attack I
can imagine is to sm_sleep(1) in the "mail from:" smtp server code or before
'250 Message accepted for delivery'.
For inetd, we must limit connections per second, not per minute.
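
The difference between the two kinds of limit can be illustrated with a sliding-window counter. This is a sketch in Python, not inetd's implementation; the class, the helper, and the figure of 256 per minute are illustrative assumptions, and timestamps are in integer milliseconds.

```python
from collections import deque

class WindowLimiter:
    """Admit at most `max_events` events in any window of `window_ms` milliseconds."""

    def __init__(self, max_events, window_ms):
        self.max_events = max_events
        self.window_ms = window_ms
        self.times = deque()             # timestamps of admitted events

    def allow(self, now_ms):
        # Forget admissions that have fallen out of the window.
        while self.times and now_ms - self.times[0] >= self.window_ms:
            self.times.popleft()
        if len(self.times) >= self.max_events:
            return False
        self.times.append(now_ms)
        return True

def count_admitted(limiter, timestamps_ms):
    """Offer one connection at each timestamp and count how many are admitted."""
    return sum(1 for t in timestamps_ms if limiter.allow(t))
```

With 1000 connection attempts spread over two seconds, a limit of 256 per minute admits its whole quota in roughly the first half second, while a limit of 5 per second never lets more than 5 through in any one-second stretch.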

--
Netch










Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Scheidt
On Fri, 16 Jul 1999, Daniel C. Sobral wrote:

> Technical follow-up:
> 
> Contrary to what I previously said, a number of tests reveal that
> Solaris, indeed, does not overcommit. All non-read only segments,

Neither does HP/UX 10.x. (Haven't got an 11 box handy to check.) 
The memory allocation process is something like this:
1) reserve is allocated from a swap area.  Preference is given to
swap devices, even if a swap file system has a higher priority.
2) If there is no space on a swap device, swap is allocated from a 
swap filesystem, if one is configured.  If there is nothing to be
allocated in a swap filesystem, the kernel attempts to grow the 
swap file on a filesystem by swchunk (a tunable, default 2MB, I think).
(Swap on filesystems starts at zero or swchunk, and is grown as needed
up to the limit spec'd at swapon(1M) time.)
3) If this fails, either because there is no space on the file system, 
or the swapfile has reached its limit, memory (actual core) is allocated.
The system tunable swapmem_on determines whether memory is used for 
swap reserve or not.  Default is to use it.
4) If there isn't swap to reserve, the request fails, even if none of 
the reserved swap is used.  

The swapinfo(1M) man page makes this quite clear:

  +Requests for more paging space will fail when they cannot be
   satisfied by reserving device, file system, or memory paging,
   even if some of the reserved paging space is not yet in use.
   Thus it is possible for requests for more paging space to be
   denied when some, or even all, of the paging areas show zero
   usage - space in those areas is completely reserved.

The upside of this is that if you do run out of swap, the kernel doesn't 
kill random processes.  The downside is, I have seen 4GB boxes, with 
plenty of swap, run out with less than a gig of memory actually in use.  
Oh, and if you swap to a filesystem, you can fill it up, without actually
using any of the space.
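
The effect described above, requests failing while most of the reserved space is untouched, falls out of any reserve-at-allocation scheme. A toy model in Python follows; it is not HP-UX's algorithm, it ignores the device/filesystem/memory ordering, and the names are hypothetical.

```python
class ReservingSwap:
    """Toy model of swap reservation without overcommit (sizes in MB)."""

    def __init__(self, total_mb):
        self.total = total_mb
        self.reserved = 0    # promised to processes at allocation time
        self.used = 0        # actually written to

    def allocate(self, mb):
        """Reserve backing store up front; fail if it cannot be reserved."""
        if self.reserved + mb > self.total:
            return False     # refused, however little is really in use
        self.reserved += mb
        return True

    def touch(self, mb):
        """Model a process actually using some of what it reserved."""
        self.used = min(self.reserved, self.used + mb)
```

After reserving 3500 MB out of 4096 and touching only 800 MB, a further request for 1000 MB is refused in this model even though about 3.3 GB of backing store holds no data.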

I don't know which behavior is more bogus.


David Scheidt









Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Brian F. Feldman
Can we kill this thread already? This resolves nothing. The only good
to come of this is all of the nice doc-proj input Matt is providing
(and providing well, I might add.)

There is no point that hasn't been rehashed a dozen times over, and
you (the ones who want overcommitting turned off) are not helping
the S/N ratio.

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 gr...@freebsd.org   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon
:> I'm sorry, but when you write code for a safety related system you
:> do not dynamically allocate memory at all.  It's all essentially static.
:> There is no issue with the memory resource.  Besides, none of the BSD's are
:> certified for any of that stuff that I know of.
:
:Sometimes it's not feasible to statically allocate memory.  You
:dynamically allocate all the memory you need at program initialization 
:(and no, we don't want to manage a pool of memory ourselves - that's
:what the OS is for).  
:...
:Note that languages such as Ada raise exceptions when memory allocation
:fails.  The underlying run-time relies on malloc returning null in
:order to raise an exception.  Normally, programs written in Ada

Simply set a resource limit. 

You are making the classic mistake of assuming that a fail-safe in the
O.S. must be integrated all the way down into the user level when, 
in fact, it is simply a matter of setting a resource limit.

When you are running an embedded system and have full control over the
software being run, setting resource limits will do what you want.  By
doing so you are effectively managing the software modules on a 
module-by-module basis and not allowing one module to indirectly affect
another.  This is what you want to do in an embedded system:  You do
not want to create a situation where a failure in one module cascades
into others.
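
One way to see a resource limit turn an oversized request into an ordinary allocation failure is the sketch below. It is an illustration only: the message does not say which limit is meant, so the choice of RLIMIT_AS is an assumption, as are the helper name and the sizes. The limit is set in a child process so that the caller is unaffected, and the sketch needs a Unix-like system where Python's resource module provides RLIMIT_AS.

```python
import subprocess
import sys

# Program run in the child: lower its own address-space limit, then allocate.
CHILD = r"""
import resource, sys
limit, want = int(sys.argv[1]), int(sys.argv[2])
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    limit = min(limit, hard)          # the soft limit cannot exceed the hard one
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    buf = bytearray(want)             # the allocation under test
    print("allocated")
except MemoryError:
    print("refused")                  # the failure is reported to the program
"""

def try_alloc_under_limit(limit_bytes, want_bytes):
    """Run the child with the given limit and return 'allocated' or 'refused'."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(want_bytes)],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()
```

With a 1 GiB limit, a 16 MiB request should succeed and a 2 GiB request should be refused inside the limited process, without any other process on the machine being involved.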

-Matt
Matthew Dillon 


:take great care to gracefully handle these exceptions.  All the C
:programs that we've ever written also take great care in handling
:NULL returns from malloc.
:
:I have no problem with overcommit, but I can see the need that
:some folks have for turning it off.  If you don't want to write
:the code to allow this, that's fine - you don't want/need it,
:so why should you?  But if other folks see a need for it, let
:_them_ write the hooks for it :-)
:
:Dan Eischen
:eisc...@vigrid.com
:









Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel Eischen
> I'm sorry, but when you write code for a safety related system you
> do not dynamically allocate memory at all.  It's all essentially static.
> There is no issue with the memory resource.  Besides, none of the BSD's are
> certified for any of that stuff that I know of.

Sometimes it's not feasible to statically allocate memory.  You
dynamically allocate all the memory you need at program initialization 
(and no, we don't want to manage a pool of memory ourselves - that's
what the OS is for).  

Note that languages such as Ada raise exceptions when memory allocation
fails.  The underlying run-time relies on malloc returning null in
order to raise an exception.  Normally, programs written in Ada
take great care to gracefully handle these exceptions.  All the C
programs that we've ever written also take great care in handling
NULL returns from malloc.

I have no problem with overcommit, but I can see the need that
some folks have for turning it off.  If you don't want to write
the code to allow this, that's fine - you don't want/need it,
so why should you?  But if other folks see a need for it, let
_them_ write the hooks for it :-)

Dan Eischen
eisc...@vigrid.com





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Alan C. Horn
On Fri, 16 Jul 1999, Matthew Dillon wrote:

>
>:  Well, NetBSD is slated to be used in the 'Space Acceleration
>:  Measurement System II', measuring the microgravity environment on
>:  the International Space Station using a distributed system based
>:  on several NetBSD/i386 boxes.
>:
>:  Sometimes your 'what-if' scenarios are others' standard operating
>:  procedures.
>:
>:  David/absolute
>:
>:   What _is_, what _should be_, and what _could be_ are all distinct.
>
>Ummm... this doesn't sound like a critical system to me.  It sounds like
>an experiment.
>

It's probably an awfully expensive experiment (putting things into space
is not cheap)


Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon

:   Well, NetBSD is slated to be used in the 'Space Acceleration
:   Measurement System II', measuring the microgravity environment on
:   the International Space Station using a distributed system based
:   on several NetBSD/i386 boxes.
:
:   Sometimes your 'what-if' scenarios are others' standard operating
:   procedures.
:
:   David/absolute
:
:   What _is_, what _should be_, and what _could be_ are all distinct.

Ummm... this doesn't sound like a critical system to me.  It sounds like
an experiment.

None of the BSD's (nor NT, nor any other complex general purpose operating
system) are certified for critical systems in space.  The reason is
simple:  None of these operating systems can deal with memory faults 
caused by radiation.  You might see it for internal communications or
non-critical sensing, but you aren't going to see it for external
communications or thruster control.

-Matt
Matthew Dillon 









Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Brownlee
On Fri, 16 Jul 1999, Matthew Dillon wrote:

> I'm sorry, but when you write code for a safety related system you
> do not dynamically allocate memory at all.  It's all essentially static.
> There is no issue with the memory resource.  Besides, none of the BSD's are
> certified for any of that stuff that I know of.
> 
> What's next:  A space shot?  These what-if scenarios are getting
> ridiculous.

Well, NetBSD is slated to be used in the 'Space Acceleration
Measurement System II', measuring the microgravity environment on
the International Space Station using a distributed system based
on several NetBSD/i386 boxes.

Sometimes your 'what-if' scenarios are others' standard operating
procedures.

David/absolute

   What _is_, what _should be_, and what _could be_ are all distinct.





To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel Eischen

> I'm sorry, but when you write code for a safety related system you
> do not dynamically allocate memory at all.  It's all essentially static.
> There is no issue with the memory resource.  Besides, none of the BSD's are
> certified for any of that stuff that I know of.

Sometimes it's not feasible to statically allocate memory.  You
dynamically allocate all the memory you need at program initialization 
(and no, we don't want to manage a pool of memory ourselves - that's
what the OS is for).  

Note that languages such as Ada raise exceptions when memory allocation
fails.  The underlying run-time relies on malloc returning null in
order to raise an exception.  Normally, programs written in Ada
take great care to gracefully handle these exceptions.  All the C
programs that we've ever written also take great care in handling
NULL returns from malloc.
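
A minimal C sketch of the pattern described above: obtain every buffer at
program initialization and treat a NULL return as a recoverable error.
The struct and function names are hypothetical. Note that on a system
that overcommits, a non-NULL return does not by itself guarantee the
pages can be backed later, which is the point under debate in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for everything the program needs at run time. */
struct buffers {
    double *samples;
    char   *log;
};

/* Allocate all buffers up front. On any failure, release what was
 * obtained, leave both pointers NULL, and return -1 so the caller can
 * degrade gracefully instead of crashing later. Returns 0 on success. */
int buffers_init(struct buffers *b, size_t n_samples, size_t log_bytes)
{
    assert(b != NULL);

    b->samples = NULL;
    b->log = NULL;

    /* Reject sizes whose byte count would overflow size_t. */
    if (n_samples > SIZE_MAX / sizeof *b->samples)
        return -1;

    b->samples = malloc(n_samples * sizeof *b->samples);
    b->log     = malloc(log_bytes);

    if (b->samples == NULL || b->log == NULL) {
        free(b->samples);       /* free(NULL) is a no-op */
        free(b->log);
        b->samples = NULL;
        b->log = NULL;
        return -1;
    }
    return 0;
}
```

The caller checks the return value once at startup and never calls
malloc again afterwards.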

I have no problem with overcommit, but I can see the need that
some folks have for turning it off.  If you don't want to write
the code to allow this, that's fine - you don't want/need it,
so why should you?  But if other folks see a need for it, let
_them_ write the hooks for it :-)

Dan Eischen
[EMAIL PROTECTED]


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Alan C. Horn

On Fri, 16 Jul 1999, Matthew Dillon wrote:

>
>:  Well, NetBSD is slated to be used in the 'Space Acceleration
>:  Measurement System II', measuring the microgravity environment on
>:  the International Space Station using a distributed system based
>:  on several NetBSD/i386 boxes.
>:
>:  Sometimes your 'what-if' scenarios are others' standard operating
>:  procedures.
>:
>:  David/absolute
>:
>:   What _is_, what _should be_, and what _could be_ are all distinct.
>
>Ummm... this doesn't sound like a critical system to me.  It sounds like
>an experiment.
>

It's probably an awfully expensive experiment (putting things into space
is not cheap).

From a financial viewpoint that may be considered critical.

Cheers,

Al


--
Alan Horn - Sysadmin - Dreamworks (+1 818 695 6256) - [EMAIL PROTECTED]
  I am Connor MacLeod of the Clan MacLeod. I was born in 1518 in the
village of Glenfinnan on the shores of Loch Sheil, and I am immortal.




To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon

:
:For those who wish to develop code for safety related systems that is
:not good enough. They have to prove that all code can handle the
:degradation of resources gracefully. Such code relies on guaranteed
:memory allocations or at the very least warnings of memory shortage and
:prioritized allocations, so the least important sub-systems die first.
:
:--Sean

I'm sorry, but when you write code for a safety related system you
do not dynamically allocate memory at all.  It's all essentially static.
There is no issue with the memory resource.  Besides, none of the BSD's are
certified for any of that stuff that I know of.

What's next:  A space shot?  These what-if scenarios are getting
ridiculous.

-Matt
Matthew Dillon 



To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon


:   Well, NetBSD is slated to be used in the 'Space Acceleration
:   Measurement System II', measuring the microgravity environment on
:   the International Space Station using a distributed system based
:   on several NetBSD/i386 boxes.
:
:   Sometimes your 'what-if' scenarios are others' standard operating
:   procedures.
:
:   David/absolute
:
:   What _is_, what _should be_, and what _could be_ are all distinct.

Ummm... this doesn't sound like a critical system to me.  It sounds like
an experiment.

None of the BSD's (nor NT, nor any other complex general purpose operating
system) are certified for critical systems in space.  The reason is
simple:  None of these operating systems can deal with memory faults 
caused by radiation.  You might see it for internal communications or
non-critical sensing, but you aren't going to see it for external
communications or thruster control.

-Matt
Matthew Dillon 
<[EMAIL PROTECTED]>


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Sean Witham


"Daniel C. Sobral" wrote:

> > It would be nice to have a way to indicate that, a la SIGDANGER.
> 
> Ok, everybody is avoiding this, so I'll comment. Yes, this would be
> interesting, and a good implementation will very probably be
> committed. *BUT*, this is not as useful as it seems. Since the
> correct solution is to buy more memory/increase swap (correct solution
> for our target markets, anyway), there is little incentive to
> implement it.
> 
> So, I think people who can answer the above are thinking like "Well,
> it is useful, but it's not useful enough for me to spend my time on
> it, and I sure as hell don't want to write mini-papers on why it's
> not that useful".
> 

For those who wish to develop code for safety related systems that is
not good enough. They have to prove that all code can handle the
degradation of resources gracefully. Such code relies on guaranteed
memory allocations or at the very least warnings of memory shortage and
prioritized allocations, so the least important sub-systems die first.

--Sean


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Replacement for grep(1) (part 2)

1999-07-16 Thread Valentin Nechayev
Daniel C. Sobral wrote:

> > 4.4BSD derived system cannot do this, and have to use different
> > machine for such applications.
>
> Incorrect. We can set *limits* to the users, so they won't be able
> to crash down the system.

No. Really, not all users use the system at the same time, and it is too
cruel to set limits that are too small. An average system has per-user
limits considerably larger than (total_resource*2/3)/n_users (2/3 being a
sub-optimal modifier). But if too many users begin to use the system, they
can overflow the resource. Group limits can soften the problem, but only a
little.

I don't remember the English word for a soft barrier; the Russian word is
'dempfer' ;) The system must provide such a soft barrier to prevent overflow
well before the real overflow. IMHO, 20% of a typical critical resource
must be held in reserve.
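
The arithmetic behind this argument can be sketched as follows. This is
an illustration only: the function names are hypothetical and the units
are whatever the resource is measured in (say, MB of swap).

```c
#include <assert.h>

/* Per-user share when two thirds of a resource is divided evenly,
 * i.e. floor(total * 2 / 3) / n_users, written to avoid overflow. */
unsigned long fair_share(unsigned long total, unsigned long n_users)
{
    assert(n_users > 0);
    unsigned long two_thirds = total / 3 * 2 + total % 3 * 2 / 3;
    return two_thirds / n_users;
}

/* Number of users who must reach per_user_limit at the same time
 * to use up the whole resource (ceiling division). */
unsigned long users_to_exhaust(unsigned long total,
                               unsigned long per_user_limit)
{
    assert(per_user_limit > 0);
    return total / per_user_limit + (total % per_user_limit != 0);
}
```

With 600 units and 10 users the even share is 40 units each, which is
the "cruel" limit. If the administrator instead allows 100 units per
user, only 6 of the 10 users need to hit their limit at once to exhaust
the resource.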

--
Netch




To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel C. Sobral
Patrick Welche wrote:
> 
> students != hostile users

We obviously have known different students... :-)

> Making mistakes is part of learning.

A hostile user is one who will act in a non-friendly manner.
Whether intentionally or not is irrelevant from the point of view of
the administrator, as far as protecting the system goes.

--
Daniel C. Sobral (8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Patrick Welche
Matthew Dillon wrote:
> 
> :On Tue, 13 Jul 1999 23:18:58 -0400 (EDT) 
> : John Baldwin  wrote:
> :
> : > What does that have to do with overcommit?  I student administrate an
> : > undergrad CS lab at a university, and when students' programs misbehave,
> : > they generate a fault and are killed.  The only machines that reboot on us
> : > without being explicitly told to are the NT ones, and yes we run FreeBSD.
> :
> :What does it have to do with overcommit?  Everything in the world!
> :
> :If you have a lot of users, all of which have buggy programs which eat
> :a lot of memory, per-user swap quotas don't necessarily save your butt.
> 
> If every single one of your users is trying to crash your machine daily,
> maybe you should consider throwing them off the system and finding users
> that are less hostile.
> 
> This conversation is getting silly.  Do you actually believe that
> an operating system can magically protect itself 100% from armloads of 
> hostile users?
> 
> Give me a break.  You people are crazy.  If you have something worthwhile
> to say I'll listen, but these "the sky is falling!" arguments are idiotic.
> 
>   -Matt
> 

students != hostile users

Making mistakes is part of learning.

Patrick


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Replacement for grep(1) (part 2)

1999-07-16 Thread Ville-Pertti Keinonen

c...@netbsd.org (Chris G. Demetriou) writes:

> Matthew Dillon  writes:
> > The text size of a program is irrelevant, because swap is never
> > allocated for it.  The data and BSS are only relevant when they

No, you can mprotect read-only vnode mappings to writable.  Most
things wouldn't be hurt badly if this changed, though; I suspect that
this already varies between operating systems.

> > are modified.
> > 
> > The only thing swap is ever used for is the dynamic allocation of memory.
> > There are three ways to do it:  sbrk(), mmap(... MAP_ANON), or
> > mmap(... MAP_PRIVATE).

> yup, almost: not all MAP_PRIVATE mappings need backing store, only
> MAP_PRIVATE and writeable mappings.  (MAP_PRIVATE does _not_ guarantee
> that you won't see modifications made via other MAP_SHARED mappings.)

...but in *this* case, you certainly shouldn't allow mprotect to fail
(with what, ENOMEM?).

It's certainly counterintuitive to me that mprotect could fail due to
a resource shortage.
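
To make the case concrete, here is a minimal sketch of the sequence in
question: a private, read-only file mapping that is later upgraded to
writable. The function name is hypothetical. Under a strict
no-overcommit policy the mprotect() call is where backing store would
have to be reserved, and so where a failure would have to be reported.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first len bytes of path privately and read-only, then
 * upgrade the mapping to writable. Returns 0 if the upgrade
 * succeeded, -1 if any step failed. */
int upgrade_private_mapping(const char *path, size_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);          /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    /* From here on every page may need private backing store. */
    int rc = mprotect(p, len, PROT_READ | PROT_WRITE);

    munmap(p, len);
    return rc;
}
```

Because the mapping is MAP_PRIVATE, the file only needs to be opened for
reading; writes after the upgrade go to private copies of the pages.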

> Actually, only now have you brought that up.  And, that's very system
> dependent.  On NetBSD/i386 the default is 2MB, and, it's worth noting
> that you only need to reserve as much as the current stack limit
> allows (after that, you're going to get a signal anyway, and if more

So what setrlimit accepts depends on how much memory is available?

Ok, programs changing their stack limit are rare, but this would still
be another API change.


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Replacement for grep(1) (part 2)

1999-07-16 Thread Ville-Pertti Keinonen

jul...@whistle.com (Julian Elischer) writes:

> If you wanted to fix this, you could add a patch to malloc that touched
> every page that it handed to the application. (and trapped sig11s)

How would you expect that to work?

Several misunderstandings seem to be common regarding this issue (most
not directed at you):

 - malloc almost never fails with NULL.  This is not true: if resource
limits are set properly, any one program using huge amounts of memory
is going to hit them long before swap space is exhausted.

 - The program currently trying to get the page is the one that is
killed.

 - Actually paging in all memory is going to protect a program from
getting killed.  This is going to make it *more likely* for it to be
killed.

 - Not overcommitting doesn't consume huge amounts of reserve space
unless programs do something special.

A rough sum of memory usage can be computed by summing up all of the
process VSZs plus your stack limit times the number of processes.  How
many of you would be willing to configure that much swap space?
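
The rough sum above can be written out as a small function. This is a
sketch only: the name is hypothetical, the inputs are assumed to be in
the same unit, and it takes the estimate exactly as stated even though
a reported VSZ may already include part of the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Worst-case reservation: every process's virtual size plus one full
 * stack limit per process, in the same units as the inputs. */
unsigned long long reserve_needed(const unsigned long long *vsz,
                                  size_t nprocs,
                                  unsigned long long stack_limit)
{
    unsigned long long total = 0;

    assert(vsz != NULL || nprocs == 0);

    for (size_t i = 0; i < nprocs; i++)
        total += vsz[i];

    return total + stack_limit * nprocs;
}
```

Three processes of 10, 20 and 30 MB with a 64 MB stack limit give
60 + 3 * 64 = 252 MB to reserve, most of it for stack that will
probably never be touched.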

If you really wanted to run without overcommit, you'd only run
statically linked binaries and set your stack limits to small values.
This could be desirable for some (but not general-purpose) systems; an
option for doing this wouldn't be entirely bogus.


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Replacement for grep(1) (part 2)

1999-07-16 Thread Valentin Nechayev
Daniel C. Sobral wrote:

> Eh? Reasonable programs *never* run into trouble. Trouble only
> happens when you have unreasonable programs around, or did not
> configure the system correctly. And if you did not configure the
> system correctly, why do you think you would be able to correctly
> estimate the stack needed for the various programs?

Your words are bad words. Exhaustion of any of the main resources - virtual
memory, disk space, process descriptors, file descriptors - is a terrible
situation, but one must not cure a headache by cutting off the head.
Every system can fall into an uncontrolled state and eat all of some
resource, and the kernel is there to protect the rest of the process pool
from this, not to destroy it. I have seen two boxes where swap unfortunately
ran out, with bad results: on the first (FreeBSD 2.2.7), the system killed
the cron (sic!) process; on the second (Linux), syslogd, sendmail and some
others became poisoned without any warnings. This is totally bad behavior;
the kernel must be a friend, not an enemy.

Actions I consider sufficient as a first (!) step:
1) Count overflows of virtual memory, file descriptors, process descriptors
and other critical resources in kernel variables (readable by sysctl).
This data must be available to watchdogs; for some systems, it is right to
reboot immediately after an overflow rather than try to work in a poisoned
state.
2) Run (in the standard setup!) cron, syslogd and other important daemons
from a special init slot (as Linux and possibly other systems allow), not
from startup scripts. Reason: they must be restarted when they die, without
admin intervention and without wrappers which can also be killed when
memory is low.
3) Declare thresholds for critical resources; for example, when more than
80% of virtual memory is used, prevent everybody except euid==0 or egid==0
from allocating new memory.
4) Provide a special signal (SIGXMEM?) to announce that memory is low and
that everyone has to reduce their memory use. Daemons should interpret this
signal similarly to SIGHUP, by exec()ing themselves to restart.
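
Proposal 3 amounts to a simple admission check. The sketch below is a
user-space model of that policy, not kernel code. The function name and
parameters are hypothetical, and it models only the threshold, not the
hard limit that applies to privileged processes as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical admission check: once usage has passed threshold_pct
 * percent of total, only euid 0 or egid 0 may allocate more. */
bool may_allocate(unsigned long used, unsigned long total,
                  unsigned threshold_pct, uid_t euid, gid_t egid)
{
    assert(total > 0 && threshold_pct <= 100);

    if (euid == 0 || egid == 0)
        return true;

    /* floor(total * pct / 100), computed without overflow. */
    unsigned long limit = total / 100 * threshold_pct
                        + total % 100 * threshold_pct / 100;

    return used <= limit;
}
```

With total = 1000 and an 80% threshold, an ordinary user is admitted at
800 units used and refused at 801, while a process with euid 0 or egid 0
is still admitted at 999.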

> Now comes the people saying "don't overcommit in *this* case, and
> overcommit in *that* case". Irrelevant. Programs are still getting
> killed because memory was overcommitted (with the added disadvantage
> of you not having as much memory as in a full overcommit mode).

The kernel can kill processes that try to get nonexistent memory. But when
it has not prevented the system from falling into overflow, it plays an
unfair game.

--
Netch




To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel C. Sobral
Matthew Dillon wrote:
> 
> Something is weird here.  If the solaris people are using a
> SWAPSIZE + REALMEM VM model, they have to allow the
> allocated + reserved space go +REALMEM bytes over available swap
> space.  If not they are using only a SWAPSIZE VM model.

I did not check if the model was a SWAPSIZE+REALMEM or a SWAPSIZE
model. Anyway, I think you are assuming that the "swap -s" command
shows as total memory just the swap space... Maybe, maybe not. I
don't know. But the space against which I reached the ceiling *was*
the one reported in the "swap -s" command.

> Wait - does Solaris normally use swap files or swap partitions?
> Or is it that weird /tmp filesystem stuff?  If it normally uses swap
> files and allows holes then that explains everything.

I'd say partitions. While perusing man pages, I caught briefly the
comment that a swap partition could overwrite a normal partition, in
a man page about a special command to create swap partitions.

Anything you'd like me to check in particular? If you have any
source code you'd like me to run, just send it to
c...@comp.cs.gunma-u.ac.jp, though I can only run it on Monday at the
earliest. Well, at least my Monday is your Sunday night...
:-)

--
Daniel C. Sobral (8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Narvi

[cc: list trimmed]

On Thu, 15 Jul 1999 lyn...@orthanc.ab.ca wrote:

> > In that scenario, the 512MB of swap I assigned to this machine would be
> > dangerously low.
> 
> With 13GB disks available for a couple of hundred bucks, my machines aren't
> going to run out of swap space any time soon, even if I commit to disk.
> 
> All I want for Christmas is a knob to disable overcommit.
> 
> --lyndon
> 

CVSup the source repository and start writing.

Sander

There is no love, no good, no happiness and no future -
all these are just illusions.



To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Dominic Mitchell
On Thu, Jul 15, 1999 at 09:57:31PM -0700, Matthew Dillon wrote:
> Something is weird here.  If the solaris people are using a 
> SWAPSIZE + REALMEM VM model, they have to allow the 
> allocated + reserved space go +REALMEM bytes over available swap 
> space.  If not they are using only a SWAPSIZE VM model.
> 
> Wait - does Solaris normally use swap files or swap partitions?
> Or is it that weird /tmp filesystem stuff?  If it normally uses swap 
> files and allows holes then that explains everything.

No, swap is slice based in Solaris.  tmpfs is just a filesystem (much
like MFS) which uses swap as backing store.  I will admit to never quite
understanding the relationship of how much swap tmpfs is willing to
steal though...  Maybe I should go and read the answerbook
(http://docs.sun.com if you want a peek).
-- 
Dom Mitchell -- Palmer & Harvey McLane -- Unix Systems Administrator

In Mountain View did Larry Wall
Sedately launch a quiet plea:
That DOS, the ancient system, shall
On boxes pleasureless to all
Run Perl though lack they C.
-- 
**
This email and any files transmitted with it are confidential and 
intended solely for the use of the individual or entity to whom they   
are addressed. If you have received this email in error please notify 
the system manager.

This footnote also confirms that this email message has been swept by 
MIMEsweeper for the presence of computer viruses.
**


Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

:Technical follow-up:
:
:Contrary to what I previously said, a number of tests reveal that
:Solaris, indeed, does not overcommit. All non-read-only segments
:and all malloc()ed memory are reserved upon exec() or fork(), and the
:reserved memory is not allowed to exceed the total memory. It makes
:extensive use of read-only DATA segments, and has a MAP_NORESERVE
:mmap() flag.
:
:Though the foot firmly planted in my mouth ought to prevent me from
:saying anything else, I must say that it does explain a few things
:to me...
:
:--
:Daniel C. Sobral   (8-DCS)
:d...@newsguy.com

Something is weird here.  If the solaris people are using a 
SWAPSIZE + REALMEM VM model, they have to allow the 
allocated + reserved space go +REALMEM bytes over available swap 
space.  If not they are using only a SWAPSIZE VM model.
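
The difference between the two models can be written as a small calculation. This is only a sketch: the function name and the example sizes are hypothetical, chosen for illustration, and nothing here is taken from the Solaris or FreeBSD kernels.

```python
def commit_limit_kb(swap_kb, ram_kb, model):
    """Ceiling on allocated + reserved VM when overcommit is not allowed."""
    if model == "SWAPSIZE":
        # Every reservable page must have a swap block behind it.
        return swap_kb
    if model == "SWAPSIZE+REALMEM":
        # Reservations may exceed swap by the amount of real memory.
        return swap_kb + ram_kb
    raise ValueError("unknown model: " + model)


if __name__ == "__main__":
    swap_kb = 256 * 1024  # hypothetical 256MB of swap
    ram_kb = 128 * 1024   # hypothetical 128MB of RAM
    for model in ("SWAPSIZE", "SWAPSIZE+REALMEM"):
        print(model, commit_limit_kb(swap_kb, ram_kb, model), "KB")
```

If the ceiling a test program hits equals the swap size alone, that points at the SWAPSIZE model; if it is higher by roughly the size of RAM, that points at SWAPSIZE + REALMEM.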

Wait - does Solaris normally use swap files or swap partitions?
Or is it that weird /tmp filesystem stuff?  If it normally uses swap 
files and allows holes then that explains everything.

-Matt
Matthew Dillon 






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Daniel C. Sobral
Technical follow-up:

Contrary to what I previously said, a number of tests reveal that
Solaris, indeed, does not overcommit. All non-read-only segments
and all malloc()ed memory are reserved upon exec() or fork(), and the
reserved memory is not allowed to exceed the total memory. It makes
extensive use of read-only DATA segments, and has a MAP_NORESERVE
mmap() flag.

Though the foot firmly planted in my mouth ought to prevent me from
saying anything else, I must say that it does explain a few things
to me...

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

:> In that scenario, the 512MB of swap I assigned to this machine would be
:> dangerously low.
:
:With 13GB disks available for a couple of hundred bucks, my machines aren't
:going to run out of swap space any time soon, even if I commit to disk.
:
:All I want for Christmas is a knob to disable overcommit.
:
:--lyndon

If your machines aren't going to run out of swap, then the overcommit 
isn't going to hurt you in a million years.

-Matt
Matthew Dillon 






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread lyndon
> And what I'm pretty sure the majority of the readers on this list want
> is for those of you who really think it's necessary to do it yourselves.
> 
> What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
> wonder whether that's significant...

Sheldon, if you can't contribute something useful, then shut up.

If I have to do it myself, I will.



re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread matthew green
   
   > All I want for Christmas is a knob to disable overcommit.
   
   And what I'm pretty sure the majority of the readers on this list want
   is for those of you who really think it's necessary to do it yourselves.
   
   What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
   wonder whether that's significant...


that's an impressively bold statement to make.  by my reckoning, at
least 4 people who have posted "wanting no overcommit" are more than
capable of programming this for NetBSD.


.mrg.





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Sheldon Hearn


On Thu, 15 Jul 1999 17:53:52 CST, lyn...@orthanc.ab.ca wrote:

> All I want for Christmas is a knob to disable overcommit.

And what I'm pretty sure the majority of the readers on this list want
is for those of you who really think it's necessary to do it yourselves.

What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
wonder whether that's significant...

Ciao,
Sheldon.


Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread lyndon
> In that scenario, the 512MB of swap I assigned to this machine would be
> dangerously low.

With 13GB disks available for a couple of hundred bucks, my machines aren't
going to run out of swap space any time soon, even if I commit to disk.

All I want for Christmas is a knob to disable overcommit.

--lyndon





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon
Here is what I get from one of BEST's mail & www proxy machines.
~dillon/br adds the object sizes together.  'swap' and 'default'
objects refer to unbacked VM objects - and none of the processes running
fork shared unbacked objects so we don't have to worry about that.  The 
'swap' designation means that at least one page in the object has been
assigned swap.  The default designation means that no pages have been 
assigned swap.  The pages can be dirty or clean.

Typical /proc/PID/map output looks like this (taken from one of the
sendmail processes).  The lines I've marked are the ones being counted
as unbacked/swap-backed VM.  The rest are vnode-backed and not counted.

0x1000     0x4b000      66   0 r-x COW vnode
0x4b000    0x4e000       3   3 rwx COW vnode
0x4e000    0x87000      53  43 rwx COW swap      <---
0x87000    0x373000    738 738 rwx     default   <---
0x2004b000 0x2005a000    2   0 r-x COW vnode
0x2005a000 0x2005c000    2   0 rwx COW vnode
0x2005c000 0x20065000    6   2 rwx COW swap      <---
0x20068000 0x2006d000    3   0 r-x COW vnode
0x2006d000 0x2006e000    1   1 rwx COW vnode
0x2006e000 0x200cc000   70   0 r-x COW vnode
0x200cc000 0x200d0000    4   4 rwx COW vnode
0x200d0000 0x200e7000    8   6 rwx COW swap      <---
0xefbde000 0xefbfe000   14  14 rwx COW swap      <---

proxy1:/tmp# cat /proc/*/map | egrep 'swap|default' | ~dillon/br
639168K
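
The ~dillon/br script itself is not shown in this thread. As a rough stand-in, the sketch below sums the sizes (end address minus start address) of the 'swap' and 'default' entries. It assumes the first two fields of each map line are hexadecimal start and end addresses, as in the listing above; the function name is hypothetical.

```python
def swap_backable_kb(map_lines):
    """Sum the sizes, in KB, of swap- and default-type map entries."""
    total_bytes = 0
    for line in map_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        # Only unbacked/swap-backed objects count; vnode-backed ones do not.
        if "swap" not in fields and "default" not in fields:
            continue
        start = int(fields[0], 16)
        end = int(fields[1], 16)
        total_bytes += end - start
    return total_bytes // 1024


if __name__ == "__main__":
    sample = [
        "0x1000 0x4b000 66 0 r-x COW vnode",
        "0x4e000 0x87000 53 43 rwx COW swap",
        "0x87000 0x373000 738 738 rwx default",
    ]
    print("%dK" % swap_backable_kb(sample))
```

Fed the concatenated map files of every process, a function like this would produce a single total comparable to the figure above, though the real script may count differently.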

proxy1:/tmp# pstat -s
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0b      524288    12596   511628     2%    Interleaved

This machine has 256MB of RAM, of which around 200MB is in use; we
will assume the entire 200MB is used by VM spaces for processes.  It is 
an active machine with around 205 processes at the time of the test.

So.  200MB of ram + 12MB of swap = 212MB of actual storage being used
out of 639MB of total swap-backable VM.

About a factor of 3.2:1.  Actual swap utilization is sitting at 2%.
If no overcommit were allowed, and assuming a VMSPACE = REALMEM + SWAP
model, 200MB of ram would be active and 439MB worth of swap would be 
either allocated or reserved ( though only 12MB would be actually written,
that part doesn't change ).  439MB of swap versus 12MB of swap.

In that scenario, the 512MB of swap I assigned to this machine would be
dangerously low.
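
The arithmetic above can be reproduced with a few lines. This is a sketch using the rounded figures from this message (639MB of swap-backable VM, 200MB of RAM in use, 12MB of swap written, 512MB of swap configured); the function and key names are hypothetical. The 3.2:1 figure appears to correspond to 639MB of VM against 200MB of RAM.

```python
def overcommit_summary(vm_total_mb, ram_used_mb, swap_used_mb, swap_size_mb):
    """Compare actual storage use with what a no-overcommit policy reserves."""
    # Under VMSPACE = REALMEM + SWAP, whatever does not fit in RAM must be
    # allocated or reserved in swap.
    swap_reserved_mb = vm_total_mb - ram_used_mb
    return {
        "storage_in_use_mb": ram_used_mb + swap_used_mb,
        "vm_to_ram_ratio": vm_total_mb / ram_used_mb,
        "swap_reserved_mb": swap_reserved_mb,
        "swap_reserved_fraction": swap_reserved_mb / swap_size_mb,
    }


if __name__ == "__main__":
    summary = overcommit_summary(639, 200, 12, 512)
    for key, value in summary.items():
        print(key, round(value, 2))
```

With these inputs the reservation comes to 439MB, a large share of the 512MB swap partition, against the 12MB actually written.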

-Matt
Matthew Dillon 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread sthaug
> If this is correct, then solaris is using a VMSPACE = SWAPSPACE
> model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

AFAIK it has been stated quite explicitly by the Solaris folks that
Solaris 2.x uses VMSPACE = SWAPSPACE + REALMEM. This is *different*
from SunOS 4.1.x.

Steinar Haug, Nethelp consulting, sth...@nethelp.no





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Jonathan Lemon
In article 
 you 
write:
>::-s   Print summary information about total swap
>::     space usage and availability:
>::
>::     allocated   The total amount of swap space
>::                 (in 1024-byte blocks)
>::                 currently allocated for use as
>::                 backing store.
>::
>::     reserved    The total amount of swap space
>::                 (in 1024-bytes blocks) not
>::                 currently allocated, but
>::                 claimed by memory mappings for
>::                 possible future use.
>::
>::     used        The total amount of swap space
>::                 (in 1024-byte blocks) that is
>::                 either allocated or reserved.
>:--
>:soda
>
>It would be really easy to test this.
>
>Write a program that malloc's 32MB of space and touches it,
>then sleeps 10 seconds and forks, with both child and parent
>sleeping afterwards.  ( the parent and the forked child should
>not touch the memory after the fork occurs ).
>
>Do a pstat -s before, after the initial touch, and after
>the fork.  If you do not see the reserved swap space jump
>by 32MB after the fork, it isn't what you thought it was.

aladdin[5:32pm]> prtconf
System Configuration:  Sun Microsystems  i86pc
Memory size: 128 Megabytes

aladdin[5:41pm]> uname -a
SunOS aladdin 5.6 Generic_105182-14 i86pc i386


total: 67280k bytes allocated + 28668k reserved = 95948k used, 196460k avail
malloced 32MB...
total: 67320k bytes allocated + 61460k reserved = 128780k used, 163592k avail
touched...
total: 100084k bytes allocated + 28696k reserved = 128780k used, 163732k avail
forking...
total: 100092k bytes allocated + 61520k reserved = 161612k used, 130864k avail
touching again (parent)...
touching again (child)...
total: 132864k bytes allocated + 28748k reserved = 161612k used, 130760k avail
exiting...
exiting...
total: 67248k bytes allocated + 28700k reserved = 95948k used, 196448k avail
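
To read the jumps off the output above without doing the subtraction by hand, the "swap -s" lines can be parsed and differenced. This is a sketch: the regular expression is written against the lines shown in this message and may need adjusting for other Solaris releases; the function names are hypothetical.

```python
import re

SWAP_LINE = re.compile(
    r"total: (\d+)k bytes allocated \+ (\d+)k reserved = (\d+)k used, (\d+)k avail"
)


def parse_swap_s(line):
    """Turn one 'swap -s' summary line into a dict of KB values."""
    match = SWAP_LINE.match(line)
    if match is None:
        raise ValueError("not a swap -s summary line: " + line)
    allocated, reserved, used, avail = (int(g) for g in match.groups())
    return {"allocated": allocated, "reserved": reserved,
            "used": used, "avail": avail}


def deltas(before, after):
    """Per-field change, in KB, between two parsed summary lines."""
    return {key: after[key] - before[key] for key in before}


if __name__ == "__main__":
    steps = [
        ("start", "total: 67280k bytes allocated + 28668k reserved = 95948k used, 196460k avail"),
        ("malloc", "total: 67320k bytes allocated + 61460k reserved = 128780k used, 163592k avail"),
        ("touch", "total: 100084k bytes allocated + 28696k reserved = 128780k used, 163732k avail"),
        ("fork", "total: 100092k bytes allocated + 61520k reserved = 161612k used, 130864k avail"),
    ]
    previous = parse_swap_s(steps[0][1])
    for name, line in steps[1:]:
        current = parse_swap_s(line)
        print(name, deltas(previous, current))
        previous = current
```

For these lines, the reserved figure rises by about 32MB (32768k) at the malloc and again at the fork, and the touch moves about the same amount from reserved to allocated.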

--
Jonathan





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

:Before program start:
:total: 20000k bytes allocated + 4792k reserved = 24792k used, 191048k available
:
:After malloc, before touch:
:total: 18756k bytes allocated + 37500k reserved = 56256k used, 159580k available
:
:After malloc + touch:
:total: 52804k bytes allocated + 4852k reserved = 57656k used, 158184k available
:
:After fork:
:total: 52928k bytes allocated + 37644k reserved = 90572k used, 125264k available
:
:[there has been a little background activity, but the numbers speak for themselves]
:
:
:Daniel

Assuming the allocated field is not inclusive of real
memory, what we have is swap reservation under solaris
for clean pages, and allocation and assignment for dirty
pages.  The grand total will tell you the total VM potential
for malloc'd space but does not appear to tell you how 
much swap is actually active - i.e. was written to and 
contains valid data.

It would be interesting to see if the stack segment is
included in the reservation.  Try setting the stack resource
limit to 32m and run the same program, except without
bothering to malloc() or touch anything.  See if the
stack segment is included in the reservation field.

It would also be interesting to see how solaris deals
with MAP_PRIVATE mmap's.

If this is correct, then solaris is using a VMSPACE = SWAPSPACE
model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

-Matt
Matthew Dillon 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Andrzej Bialecki
On Wed, 14 Jul 1999, John Nemeth wrote:

> On Jul 15,  2:40am, "Daniel C. Sobral" wrote:
> } Garance A Drosihn wrote:
> } > At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
> } > > In which case the program that consumed all memory will be killed.
> } > > The program killed is +NOT+ the one demanding memory, it's the one
> } > > with most of it.
> } > 
> } > But that isn't always the best process to have killed off...
> } 
> } Sure it is. :-) Let's see...
> 
>  This statement is absurd.  Only a competent admin can decide
> which process can be killed.  No arbitrary decision is going to be
> correct.
> 
> } > It would be nice to have a way to indicate that, a la SIGDANGER.

How about assigning something like a class to each process, which gives the VM
system a hint about which processes should be killed first without much thinking,
and which last (or never)? In other words, let's say class 10 means
"totally disposable, kill whenever you want", and class 1 means "never try
to kill me". Of course, most processes would get some default value, and
the superuser could "renice" them to a more resistant class.

This way both sides of the discussion would be satisfied :-)
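
A minimal sketch of how such a class could drive the choice of victim. Everything here is hypothetical: the Proc record, the field names, and the tie-break on memory size (borrowed from the "kill the biggest process" policy discussed earlier in the thread) are illustrations, not existing kernel code.

```python
from collections import namedtuple

# kill_class runs from 1 ("never try to kill me") to 10 ("totally disposable").
Proc = namedtuple("Proc", "pid name kill_class rss_kb")

NEVER_KILL = 1


def pick_victim(procs):
    """Pick the most disposable process; among equals, the largest one."""
    candidates = [p for p in procs if p.kill_class != NEVER_KILL]
    if not candidates:
        return None
    return max(candidates, key=lambda p: (p.kill_class, p.rss_kb))


if __name__ == "__main__":
    procs = [
        Proc(1, "init", 1, 400),
        Proc(210, "sendmail", 5, 3000),
        Proc(350, "netscape", 10, 40000),
        Proc(351, "xbiff", 10, 900),
    ]
    print(pick_victim(procs))
```

A process left at the default class would still be eligible, so the largest-consumer behaviour remains as the fallback when nobody has set a class.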

Andrzej Bialecki

//   WebGiro AB, Sweden (http://www.webgiro.com)
// ---
// -- FreeBSD: The Power to Serve. http://www.freebsd.org 
// --- Small & Embedded FreeBSD: http://www.freebsd.org/~picobsd/ 



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon
::-s   Print summary information about total swap
::     space usage and availability:
::
::     allocated   The total amount of swap space
::                 (in 1024-byte blocks)
::                 currently allocated for use as
::                 backing store.
::
::     reserved    The total amount of swap space
::                 (in 1024-bytes blocks) not
::                 currently allocated, but
::                 claimed by memory mappings for
::                 possible future use.
::
::     used        The total amount of swap space
::                 (in 1024-byte blocks) that is
::                 either allocated or reserved.
:--
:soda

It would be really easy to test this.

Write a program that malloc's 32MB of space and touches it,
then sleeps 10 seconds and forks, with both child and parent
sleeping afterwards.  ( the parent and the forked child should
not touch the memory after the fork occurs ).

Do a pstat -s before, after the initial touch, and after
the fork.  If you do not see the reserved swap space jump
by 32MB after the fork, it isn't what you thought it was.
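
A sketch of such a test program follows. It is a Python stand-in for what would more naturally be a short C program, and it substitutes an anonymous private mmap() for malloc(); the interpreter's own memory adds a little noise to the numbers. The function name, the phase labels, and the extra pause between mapping and touching are assumptions, not part of the description above.

```python
import mmap
import os
import time


def run_test(size=32 * 1024 * 1024, pause=10, report=None):
    """Map `size` bytes, touch every page, then fork without touching again."""
    if report is None:
        report = lambda phase: print(phase, flush=True)

    report("start")
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANON)
    report("mapped")
    time.sleep(pause)

    # Dirty one byte per page so every page is really instantiated.
    for offset in range(0, size, mmap.PAGESIZE):
        buf[offset] = 1
    report("touched")
    time.sleep(pause)

    pid = os.fork()
    if pid == 0:
        # Child: sleep without touching the memory, then leave.
        time.sleep(pause)
        os._exit(0)

    report("forked")
    time.sleep(pause)  # parent also leaves the memory alone
    _, status = os.waitpid(pid, 0)
    buf.close()
    return os.WEXITSTATUS(status)
```

Call run_test() from one terminal and run "pstat -s" (or "swap -s" on Solaris 2.x) in another during each pause, then compare the reserved figure before and after the "forked" line is printed.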

-Matt
Matthew Dillon 






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

:"pstat -s" on SunOS4, and "swap -s" on SunOS5. From Solaris man page:
:
::-s   Print summary information about total swap
::     space usage and availability:
::
::     allocated   The total amount of swap space
::                 (in 1024-byte blocks)
::                 currently allocated for use as
::                 backing store.
::
::     reserved    The total amount of swap space
::                 (in 1024-bytes blocks) not
::                 currently allocated, but
::                 claimed by memory mappings for
::                 possible future use.
::
::     used        The total amount of swap space
::                 (in 1024-byte blocks) that is
::                 either allocated or reserved.
:--
:soda

Yah, that's what I thought.  A Solaris expert could tell us
for sure, but I am pretty sure those are simply cached swap
blocks after-the-fact, not actual reservations on potentially
swappable space.

-Matt
Matthew Dillon 






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Noriyuki Soda
> On Thu, 15 Jul 1999 11:09:01 -0700 (PDT),
Matthew Dillon  said:

> Umm... how are you getting the reserved numbers? 

"pstat -s" on SunOS4, and "swap -s" on SunOS5. From Solaris man page:

:-s   Print summary information about total swap space usage
:     and availability:
:
:     allocated   The total amount of swap space (in 1024-byte
:                 blocks) currently allocated for use as
:                 backing store.
:
:     reserved    The total amount of swap space (in 1024-byte
:                 blocks) not currently allocated, but claimed
:                 by memory mappings for possible future use.
:
:     used        The total amount of swap space (in 1024-byte
:                 blocks) that is either allocated or reserved.
--
soda





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon
:Both Dillon and Sobral mistakenly claimed that "Solaris overcommits",
:this fact seems to be somewhat suggestive.
:
:Also, the following are the allocated and reserved memory figures
:in my environment. (This table also includes Eduardo's example.)
:
:   SunOS   allocated   reserved      total   total/allocated
:   -----   ---------   --------   --------   ---------------
:   4.1.4       4268k      1248k      5516k   1.2924
:   4.1.2       7732k      1492k      9224k   1.193
:   4.1.4       8848k      3080k     11928k   1.3481
:   4.1.4      13532k      6772k     20304k   1.5004
:   5.5.1      15312k      5092k     20404k   1.3325
:   4.1.3      16112k      6512k     22624k   1.4042
:   4.1.2      26356k      1620k     27976k   1.0615
:   4.1.4      26560k      3756k     30316k   1.1414
:   5.5        26076k     11348k     37424k   1.4352
:   4.1.4      32984k      5556k     38540k   1.1684
:   5.6        32448k      7072k     39520k   1.2179
:   4.1.4      38056k      3692k     41748k   1.097
:   4.1.4      49064k      7672k     56736k   1.1564
:   4.1.4      67012k      7800k     74812k   1.1164
:   4.1.4      99348k     16956k    116304k   1.1707
:   4.1.4     118288k     11780k    130068k   1.0996
:   5.6       231968k     18880k    250848k   1.0814
:   5.7       307240k     19464k    326704k   1.0634
:
:   (sorted by total amount of used swap)
:
:In those examples, a non-overcommitting system requires 1.06x ... 1.50x
:...
:soda

Umm... how are you getting the reserved numbers?  Are you
sure that isn't simply cached swap blocks?  I.e. when something
gets swapped out and then is swapped back in and dirtied,
Solaris may be holding the swap block assignment rather
than letting it go.  FreeBSD-stable does the same thing.
FreeBSD-current does not -- it lets it go in order to be
able to reallocate it later as part of a contiguous swath
for performance reasons.

These 'extra' swap blocks are effectively reserved but not
actually allocated.  They can be reassigned.  The numbers
above are very similar to what you would see in a
redirtying-cache swap block situation on a FreeBSD-stable
system.

If I add up all the unshared writeable segments on my
home box - that is, all segments for which one would
potentially have to reserve swap space - I get a total
of around 382MB.  The machine is currently eating around
100MB of RAM and 5MB of swap, or around a 3.5:1 ratio
in this case.  A non-overcommit model would have to
reserve swap space for 382MB - 100MB = 282MB versus the
5MB of swap the machine actually allocates.

-Matt
Matthew Dillon 







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Noriyuki Soda
> On Thu, 15 Jul 1999, Daniel C. Sobral wrote:
>> Uh... like any modern unix, Solaris overcommits.

> On Thu, 15 Jul 1999 08:46:36 -0700 (PDT),
"Eduardo E. Horvath"  said:

> Where do you guys get this misinformation?  
:
> Note the `19464k reserved'; that space has been reserved but not yet
> allocated.

Both Dillon and Sobral mistakenly claimed that "Solaris overcommits";
this fact seems to be somewhat suggestive.

Also, the following are the allocated and reserved memory figures
in my environment. (This table also includes Eduardo's example.)

SunOS   allocated   reserved      total   total/allocated
-----   ---------   --------   --------   ---------------
4.1.4       4268k      1248k      5516k   1.2924
4.1.2       7732k      1492k      9224k   1.193
4.1.4       8848k      3080k     11928k   1.3481
4.1.4      13532k      6772k     20304k   1.5004
5.5.1      15312k      5092k     20404k   1.3325
4.1.3      16112k      6512k     22624k   1.4042
4.1.2      26356k      1620k     27976k   1.0615
4.1.4      26560k      3756k     30316k   1.1414
5.5        26076k     11348k     37424k   1.4352
4.1.4      32984k      5556k     38540k   1.1684
5.6        32448k      7072k     39520k   1.2179
4.1.4      38056k      3692k     41748k   1.097
4.1.4      49064k      7672k     56736k   1.1564
4.1.4      67012k      7800k     74812k   1.1164
4.1.4      99348k     16956k    116304k   1.1707
4.1.4     118288k     11780k    130068k   1.0996
5.6       231968k     18880k    250848k   1.0814
5.7       307240k     19464k    326704k   1.0634

(sorted by total amount of used swap)

In those examples, a non-overcommitting system requires 1.06x ... 1.50x
the swap space of an overcommitting system.  This table also indicates
that the ratio decreases as the total used swap increases. The extra
swap space required on a non-overcommitting system is approximately
several tens of megabytes, i.e. the extra cost of a
non-overcommitting system is less than ten dollars in my environment.
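
The total/allocated column can be recomputed from the other two. A
small sketch in C, where the struct and function names are made up for
illustration and total is taken to be allocated + reserved, as the man
page's definition of "used" suggests:

```c
#include <assert.h>
#include <stdio.h>

/* One row of the table; sizes are in kilobytes, as reported by
 * "pstat -s" or "swap -s". */
struct swap_row {
    const char *sunos;
    unsigned long allocated_k;
    unsigned long reserved_k;
};

/* total/allocated: how much swap a reserving system claims relative
 * to what is actually allocated. */
double swap_ratio(const struct swap_row *r)
{
    assert(r->allocated_k > 0);
    return (double)(r->allocated_k + r->reserved_k)
         / (double)r->allocated_k;
}

/* Print each row with its recomputed total and ratio. */
void print_rows(const struct swap_row *rows, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%-5s %8luk %8luk %8luk %.4f\n",
               rows[i].sunos, rows[i].allocated_k, rows[i].reserved_k,
               rows[i].allocated_k + rows[i].reserved_k,
               swap_ratio(&rows[i]));
}
```

Feeding it the smallest and largest rows (4268k/1248k and
307240k/19464k) gives ratios that round to the 1.2924 and 1.0634
shown in the table.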

Matt Dillon claimed that a non-overcommitting system requires 8x or more
the swap space of an overcommitting system. The figures above show that
is just wrong. (There might be cases which require 8x swap, but they are
 not typical, contrary to what Dillon said.)

If you don't want a non-overcommitting system because you don't want to
pay its cost, that's OK, but please don't force us to accept your
limited view.
--
soda





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Garance A Drosihn
At 6:29 PM -0700 7/14/99, Matthew Dillon wrote:
>If 1G isn't enough, spend another $30 and throw 2G of swap
>online.  Or perhaps dedicate an entire $150 disk and throw
>6+ GB of swap online.
>
>The equivalent setup using a non-overcommit model would require
>considerably more swap to have the same reliability.

Please note that we're talking at cross-purposes here, mainly
because I didn't realize this same general topic was being
beaten to death in the 'replacement for grep' thread (which I
have not been following).

Speaking for just me myself and I, I have no problems with the
current overcommit model.  All I'd like to do is have a way to
indicate which processes should not get booted first, if the
system does indeed run out of swap and needs to boot some
processes.  However, other people seem much more worked up
about this topic than I am, and thus what I (personally) meant
as "just casual questions" seem to be taken as "demands that
something be done, RIGHT NOW".

I now realize that some people are arguing that malloc should
return an error if the system runs out of space, but that's not
what I am thinking about.

So, I think I'll bow out of this discussion for now, and maybe
try to discuss my "casual questions" sometime in a different
context...

---
Garance Alistair Drosehn   =   g...@eclipse.acs.rpi.edu
Senior Systems Programmer  or  dro...@rpi.edu
Rensselaer Polytechnic Institute





Re: Replacement for grep(1) (part 2)

1999-07-15 Thread Daniel C. Sobral
Kevin Schoedel wrote:
> 
> >Imagine a reasonably big
> >program, like Netscape or Emacs, of which you usually just use a
> >subset of features. There can easily be many megabytes of code and
> >data in them you never actually use, or you don't _usually_ use
> >(like the people who use emacs like it was vi :). Without
> >overcommit, you need to allocate all that memory for the code, no
> >matter whether you end up using it or not. With overcommit, there is
> >no such problem.
> 
> Code, static data, and not-yet-written writable data should be backed by
> the executable file, not by swap space, so unused code and tables should
> not be a problem.

TEXT should be backed by the executable, as long as the program
doesn't change it to read/write. That's not the code I was referring
to. Not-yet-written blah-blah-blah should be backed by:

1) The executable file if you are overcommitting.
2) RAM/Swap if you are not. If you don't do this, you are
overcommitting. Proof: let the system exhaust its memory. Change a
single byte in the not-yet-written stuff. Now you need more memory
than you have to comply with a regular operation (like changing the
value of a global variable), which means you overcommitted.
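
The single-byte change in the proof can be observed with a private
file mapping, which is roughly how writable data from an executable is
mapped. A hedged sketch, assuming POSIX mmap(), mkstemp() and pread();
the function name and the temporary file are invented for
illustration. After the write, the file on disk is unchanged, so the
modified page can only live in RAM or swap:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map a file MAP_PRIVATE, change one byte through the mapping, and
 * check whether the file itself changed.
 * Returns 1 if the file is intact, 0 if it changed, -1 on error. */
int private_write_leaves_file_intact(void)
{
    char path[] = "/tmp/cowdemoXXXXXX";
    const char original[] = "initialized data";
    char ondisk[sizeof original];
    char *map;
    ssize_t n;
    int fd, intact;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* the file lives on until fd is closed */

    if (write(fd, original, sizeof original) != (ssize_t)sizeof original) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, sizeof original, PROT_READ | PROT_WRITE,
               MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    map[0] = 'I';               /* the single-byte change: the kernel must
                                 * now supply a page for a private copy */
    assert(map[0] == 'I');      /* our own view does see the change */

    n = pread(fd, ondisk, sizeof ondisk, 0);
    intact = (n == (ssize_t)sizeof ondisk &&
              memcmp(ondisk, original, sizeof original) == 0);

    munmap(map, sizeof original);
    close(fd);
    return n == (ssize_t)sizeof ondisk ? intact : -1;
}
```

Whether the system promised that page at mmap() time or only finds it
at the moment of the write is exactly the overcommit question.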

Now come the people saying "don't overcommit in *this* case, and
overcommit in *that* case". Irrelevant. Programs are still getting
killed because memory was overcommitted (with the added disadvantage
of you not having as much memory as in a full overcommit mode).

> Stack is more interesting. There might be a place for a global overcommit
> switch. I think I'd be happier with a scheme in which the first page or
> first few pages of stack are committed (so that reasonable programs will
> never run into trouble) and remaining stack is over-/un-committed by
> default, along with means for unusual programs to commit (and/or test
> commitability of) subsequent pages.

Eh? Reasonable programs *never* run into trouble. Trouble only
happens when you have unreasonable programs around, or did not
configure the system correctly. And if you did not configure the
system correctly, why do you think you would be able to correctly
estimate the stack needed for the various programs?

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Michael Schuster - TSC SunOS Germany
Hi everyone,

I've been following this discussion almost from the beginning, and I
have the feeling that we're not _really_ getting very far. There are good
arguments for and against overcommit, depending on your point of view
and your requirements.

What I do see is a not-so-openly voiced consensus that the way
resource shortages are handled in an overcommitting system
(SIGKILL) makes some of us rather unhappy. I therefore suggest those of
us who would like to see a change in this area pool their efforts and
energies to work on a mechanism that handles resource shortage in a more
graceful way.

cheerio
Michael
-- 
michael.schus...@germany.sun.com





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Tim Vanderhoek
On Thu, Jul 15, 1999 at 01:48:40PM +0900, Daniel C. Sobral wrote:
> > 
> > If you have a lot of users, all of which have buggy programs which eat
> > a lot of memory, per-user swap quotas don't necessarily save your butt.
> 
> The chance of these buggy programs running at the same time is not
> exactly high...

Well, it is higher than you're probably giving it credit for.  Suppose
Professor A. hands out assignment X.  Unfortunately, some piece of
code he supplied to his, let's say, 200 ignorant first-year
students has this particular memory-eating bug.  Being ignorant
first-year students, they will notice something is wrong, assume
the problem is their fault, and repeat the exact same procedure
five or so times.  Again, being ignorant first-year students, they
will probably all be using the same shell server.

To make things worse, some wise-ass may have told a bunch of them how
to use ulimit or limit in order to push their available resources as
high as possible (perhaps very high, since the admin hopefully
recognizes that sometimes students need high resource limits to
perform research).
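
The shell's ulimit and limit built-ins sit on top of getrlimit() and
setrlimit(). A minimal sketch of "pushing a limit as high as
possible", assuming a POSIX system; the function name is made up for
illustration. An unprivileged process can raise its soft limit only up
to the hard limit the admin left in place:

```c
#include <assert.h>
#include <sys/resource.h>

/* Raise the soft limit for `resource` (e.g. RLIMIT_DATA) to the hard
 * limit.  Returns 0 on success, -1 on error. */
int raise_soft_limit_to_hard(int resource)
{
    struct rlimit rl, check;

    if (getrlimit(resource, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;          /* soft limit := hard limit */
    if (setrlimit(resource, &rl) != 0)
        return -1;

    /* Sanity check: the kernel should now report soft == hard. */
    if (getrlimit(resource, &check) != 0)
        return -1;
    assert(check.rlim_cur == check.rlim_max);
    return 0;
}
```

With generous hard limits, 200 students doing this leave the per-user
limits with little power to contain a runaway program.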

Fortunately, overcommit rescues the machine and kills those buggy
programs instead of letting them spin around forever in some kind of
"malloc() failed ... must be temporary failure, wait and retry".


-- 
This is my .signature which gets appended to the end of my messages.





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Mike Smith
> 
> And what do you do, then, with the processes that happen to have
> legitimate use for more stack?
> 
> Or maybe you just find out how much stack each process uses, and
> then set limits appropriate for each one? Which is the equivalent of
> setting limits to each user, of course...

You get a little program, like eg. Xenix and Minix had, which lets you 
modify the executable header to indicate how much stack the system 
should reserve.  If the program decides to use more stack for some 
reason, then it dies; this is in effect "stack overcommit".  8)

-- 
\\  The mind's the standard   \\  Mike Smith
\\  of the man.   \\  msm...@freebsd.org
\\-- Joseph Merrick   \\  msm...@cdrom.com







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral
lyn...@orthanc.ab.ca wrote:
> 
> What it so evil about having a reasonably intelligent malloc() that
> tells the truth, and returns unused memory to the system? Overcommit
> is for lazy programmers, plain and simple. At least the SGI documentation
> about overcommit admits that (or at least, did at one time).

Yes. So are high-level languages, as a matter of fact. True
memory-conscious programmers will never use anything besides
assembler.

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral
Jason Thorpe wrote:
> 
> If you have a lot of users, all of which have buggy programs which eat
> a lot of memory, per-user swap quotas don't necessarily save your butt.

The chance of these buggy programs running at the same time is not
exactly high...

> And maybe the individual programs didn't encounter their resource limits.
> 
> ...but the sheer number of these runaway things caused the overcommit to
> be a problem.  If malloc() or whatever had actually returned NULL at the
> right time (i.e. as backing store was about to become overcommitted), then
> these runaway processes would have stopped running away (they would have
> gotten a SIGSEGV and died).
> 
> Anyhow, my "lame undergrads" example comes from a time when PCs weren't
> really powerful enough for the job (or something; anyhow, we didn't have
> any in the department :-).  My example is from a Sequent Balance (16
> ns32032 processors, 64M RAM [I think; been a while], 4.2BSD variant).

So, tell me... when NetBSD gets its non-overcommit switch, would
you use it in the environment you describe?

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral
John Nemeth wrote:
> 
>  The machine in question has run out of swap, due to unforeseeable
> excessive memory demands.  This was accompanied by processes
> complaining about not being able to allocate memory and then cleaning
> up after themselves.  I did not see random processes being killed
> because of it.  That is the way things should be.  From this, I can
> assume that the OS doesn't overcommit.  In case you're wondering, the
> OS in question is:
> 
> SunOS 5.5 Generic_103093-25 sun4u sparc

Uh... like any modern unix, Solaris overcommits.

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral
Michael Richardson wrote:
> 
> Ben> Tell me, Mr. Nemeth, has this ever happened to you?  Have you ever
> Ben> come *close*?
> 
>   Uh, since we don't run overcommit, the answer is specifically *NO*.

And what system do you run?

>   I have had it happen on other systems. (Solaris, AIX) It was very
> mystifying to diagnose. Sure, the systems were misconfigured for what we
> were trying to do, but if I wanted to build a custom system for every
> application well... I'd be running NT.

I have to agree about the mystifying diagnosis... Especially when they
*don't* page like hell.

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral
Michael Richardson wrote:
> 
>   No, I don't agree.
> 
>   This is the biggest argument against solving the overcommit situation with
> SIGKILL. I have no problem with overcommit as a concept, I have a problem
> with being unable to keep my possibly big processes (X, rpc.nisd,
> etc. depending on circumstances) from being victims.

It is no more difficult to protect big processes than it is to
create user limits.
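
For illustration only, here is a rough sketch of what a limit of this kind looks like when applied to a single command, using Python's standard resource module (a wrapper around setrlimit(2)). The helper names clamp_to_hard and run_with_data_limit are made up for this example, and the choice of RLIMIT_DATA is an assumption; which limit actually bounds memory use differs between systems.

```python
import resource
import subprocess


def clamp_to_hard(requested, hard):
    """Return the soft limit to request, never exceeding the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def run_with_data_limit(argv, max_bytes, which=resource.RLIMIT_DATA):
    """Run argv with its soft resource limit lowered to max_bytes.

    Hypothetical helper: the limit is applied in the child just before
    exec, so the parent's own limits are left untouched.
    """
    def apply_limit():
        _, hard = resource.getrlimit(which)
        soft = clamp_to_hard(max_bytes, hard)
        resource.setrlimit(which, (soft, hard))

    return subprocess.run(argv, preexec_fn=apply_limit).returncode
```

A process started this way should see its allocations fail once it grows past the limit, instead of growing until the whole machine runs short of swap.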

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral
John Nemeth wrote:
> 
>  On one system I administrate, the largest process is typically
> rpc.nisd (the NIS+ server daemon).  Killing that process would be a
> bad thing (TM).  You're talking about killing random processes.  This
> is no way to run a system.  It is not possible for any arbitrary
> decision to always hit the correct process.  That is a decision that
> must be made by a competent admin.  This is the biggest argument
> against overcommit:  there is no way to gracefully recover from an
> out of memory situation, and that makes for an unreliable system.

If you run out of memory, it is either a misconfigured system, or a
runaway program. If a program is runaway, then:

1) It is larger than your typical rpc.nisd.
2) You cannot tell the system a priori to kill it, because you don't
know about it (or else you wouldn't be running it in the first place).

A system running in overcommit assumes that you have it correctly
configured so it will *not* run out of memory under normal
conditions. This happens to be the same assumption Unix makes.

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"







Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Mike Smith

> 
> And what do you do, then, with the processes that happen to have
> legitimate use for more stack?
> 
> Or maybe you just find out how much stack each process uses, and
> then set limits appropriate for each one? Which is the equivalent of
> setting limits to each user, of course...

You get a little program, like the ones Xenix and Minix had, which lets you 
modify the executable header to indicate how much stack the system 
should reserve.  If the program decides to use more stack for some 
reason, then it dies; this is in effect "stack overcommit".  8)

-- 
\\  The mind's the standard   \\  Mike Smith
\\  of the man.   \\  [EMAIL PROTECTED]
\\-- Joseph Merrick   \\  [EMAIL PROTECTED]







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Matthew Dillon

:
:On Jul 15, 12:20am, "Daniel C. Sobral" wrote:
:} "Charles M. Hannum" wrote:
:} > 
:} > That's also objectively false.  Most such environments I've had
:} > experience with are, in fact, multi-user systems.  As you've pointed
:} > out yourself, there is no combination of resource limits and whatnot
:} > that are guaranteed to prevent `crashing' a multi-user system due to
:} > overcommit.  My simulation should not be axed because of a bug in
:} > someone else's program.  (This is also not hypothetical.  There was a
:} > bug in one version of bash that caused it to consume all the memory it
:} > could and then fall over.)
:} 
:} In which case the program that consumed all memory will be killed.
:} The program killed is +NOT+ the one demanding memory, it's the one
:} with most of it.
:
: On one system I administrate, the largest process is typically
:rpc.nisd (the NIS+ server daemon).  Killing that process would be a
:bad thing (TM).  You're talking about killing random processes.  This
:is no way to run a system.  It is not possible for any arbitrary
:decision to always hit the correct process.  That is a decision that
:must be made by a competent admin.  This is the biggest argument
:against overcommit:  there is no way to gracefully recover from an
:out of memory situation, and that makes for an unreliable system.
:
:}-- End of excerpt from "Daniel C. Sobral"

... and the chance of that system running out of swap space
is?  

The machine has hit the wall, the admin can't login.  What 
is the kernel to do?

-Matt
Matthew Dillon 







Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Matthew Dillon
:> I mean, jeeze, the reservation for the program stack alone would eat
:> up all your available swap space!  What is a reasonable stack size?  The
:> system defaults to 8MB.  Do we rewrite every program to specify its own
:> stack size?  How do we account for architectural differences?  
:
:The alternative is to rewrite every program that assumes the semantics
:of malloc() are being followed. The problem I have as an applications
:writer is that I tend to believe malloc. To pick a specific example,
:our IMAP client takes steps to ensure it won't run out of memory in
:critical sections. We maintain a "rainy day" pool block of memory. If
:...
:--lyndon

We just put a cap on the number of imap clients we allow running
at any given moment... say, a few hundred.  Not only does it
work just dandy, it also prevents the machine from overloading
and gives us a nice "you may want to look into this" alarm.

We do the same thing with sendmail, popper, the web server,
named, and every other service which can be forked.

This also prevents one subsystem from overly interfering with
another.   For example, if popper saturates it does not overly
interfere with imapd operation.

The limit is set to around 3x the monday peak load, and 
sufficient swap is configured to deal with it should the limit
be hit.

Problem solved.  No fancy modifications required.  If any of
these subsystems actually ever got close to using all available
swap, the other subsystems would be up the creek anyway so, really,
it doesn't make much sense hacking the source to allow the subsystem
to run into the wall anyway.
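
A minimal sketch of the per-service cap described above, written in Python purely for illustration. The names ServiceCap and swap_needed_mb, the default headroom factor of 3 (taken from the "3x the monday peak load" figure), and the per-process memory numbers are assumptions of this example, not anything from the actual setup.

```python
class ServiceCap:
    """Admission counter for one forked service (e.g. imapd, popper)."""

    def __init__(self, name, peak_load, headroom=3):
        self.name = name
        # Cap is a multiple of the observed peak, so hitting it is an alarm.
        self.limit = peak_load * headroom
        self.active = 0
        self.refused = 0

    def try_start(self):
        """Return True if a new instance may be forked, False if capped."""
        if self.active >= self.limit:
            self.refused += 1  # worth logging: "you may want to look into this"
            return False
        self.active += 1
        return True

    def finish(self):
        """Record that one instance of the service has exited."""
        if self.active > 0:
            self.active -= 1


def swap_needed_mb(caps, per_process_mb):
    """Worst-case swap if every service sits at its cap at the same time."""
    return sum(cap.limit * per_process_mb[cap.name] for cap in caps)
```

Because each service has its own counter, one saturated service cannot use up the slots of another, and swap_needed_mb gives a figure to size swap against.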

-Matt
Matthew Dillon 







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Matthew Dillon
:Our IMAP server routinely shows a footprint of about 1MB private storage.
:This is constant for most operations. However, when you get into doing
:SEARCH and SORT, there are certain cases where we need memory, sometimes
:a *lot* of memory.
:
:Your proposal is that my *well behaved* application should be arbitrarily
:killed, leaving the client stuck with a) no results and b) no IMAP
:connection, in this situation. (And think threaded. That one server
:could be handling *hundreds* of clients.) This is preferable to
:returning a NULL to the malloc() request, which I can handle
:gracefully by simply returning a NO response to the IMAP client?
:
:What is so evil about having a reasonably intelligent malloc() that
:tells the truth, and returns unused memory to the system? Overcommit
:is for lazy programmers, plain and simple. At least the SGI documentation
:about overcommit admits that (or at least, did at one time).
:
:--lyndon

If you are running an IMAP server that regularly runs out of swap
space, you have a configuration problem which needs to be addressed.
It's as simple as that.  What you are putting forth is an example
of something that will never happen on a properly configured 
server.

In regard to the general case where one is running third-party
applications: here you are assuming that you can go in and modify
every single piece of software running on the machine to deal
with malloc() returning NULL.  Because if you don't, the machine
isn't going to be very stable.

Not only that, you are assuming that you will make the correct
decision on what action to take when malloc() *does* return NULL.
If you decide to return an error code but not exit, what happens
when a potential blowup situation results in thousands of imap
processes being run on the system, and NONE of them exit when
their malloc() fails?

The problem is a whole lot more complex than simply having the
OS return NULL from a malloc().  Currently the OS kills processes
as a last resort.  The idea is that no nominally running system
runs out of swap.  Now you propose to take away the kernel's
ability to recover some memory as a last resort and instead
put it into the hands of the very user or root-run processes
that are causing the problem in the first place!  A much better
solution would be to write a simple watchdog script that notices
when swap space is low and does the right thing -- e.g. kills
the non-essential processes and leaves the essential ones alone.
Then the kernel never actually reaches a state of last-resort.
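
One way such a watchdog could be sketched, in Python for illustration. Everything named here (Proc, pick_victims, kill_victims, the low-water mark, the essential set) is hypothetical, and how free swap and per-process usage are measured is deliberately left to the caller, since that differs between systems.

```python
import os
import signal
from collections import namedtuple

# One entry per running process; swap_kb is however the caller measures size.
Proc = namedtuple("Proc", "pid name swap_kb")


def pick_victims(procs, swap_free_kb, low_water_kb, essential):
    """Choose non-essential processes to kill, largest first.

    Returns an empty list while free swap is at or above the low-water
    mark; otherwise returns enough victims to get back above it, or all
    non-essential processes if even that is not enough.
    """
    if swap_free_kb >= low_water_kb:
        return []
    needed = low_water_kb - swap_free_kb
    candidates = sorted(
        (p for p in procs if p.name not in essential),
        key=lambda p: p.swap_kb,
        reverse=True,
    )
    victims = []
    for p in candidates:
        if needed <= 0:
            break
        victims.append(p)
        needed -= p.swap_kb
    return victims


def kill_victims(victims, sig=signal.SIGTERM):
    """Signal each victim, ignoring ones that have already exited."""
    for p in victims:
        try:
            os.kill(p.pid, sig)
        except ProcessLookupError:
            pass
```

Run periodically with an essential set such as {"rpc.nisd"}, something like this acts before swap is exhausted, so the admin's policy decides what dies rather than the kernel's last-resort choice.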

-Matt
Matthew Dillon 







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral
Sergey Babkin wrote:
> 
> > It would be nice to have a way to indicate that, a la SIGDANGER.
> 
> Another option may be to add something like "importance classes".
> Suppose we assign a one-byte "importance level" to each process.
> When we run out of swap we start killing processes with the lowest
> importance level. This seems to be both easy to implement and
> a rather robust solution.

This is as easy to do as setting limits, which has the added benefit
of not having any process killed.
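
For reference, the "importance class" selection quoted above might look roughly like the following Python sketch. The function name next_victim and the tuple layout are invented for the example, and breaking ties by picking the larger process is an assumption; the proposal itself only says to kill the lowest importance level first.

```python
def next_victim(procs):
    """Pick the pid to kill next when swap runs out.

    procs is an iterable of (pid, importance, size_kb) tuples, where
    importance is a one-byte level (0-255) and lower means more expendable.
    Returns None when there is nothing to kill.
    """
    procs = list(procs)
    if not procs:
        return None
    for pid, importance, _ in procs:
        if not 0 <= importance <= 255:
            raise ValueError("importance for pid %d must fit in one byte" % pid)
    # Lowest importance first; among equals, the larger process frees more.
    pid, _, _ = min(procs, key=lambda p: (p[1], -p[2]))
    return pid
```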

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral

Jason Thorpe wrote:
> 
> If you have a lot of users, all of which have buggy programs which eat
> a lot of memory, per-user swap quotas don't necessarily save your butt.

The chance of these buggy programs running at the same time is not
exactly high...

> And maybe the individual programs didn't encounter their resource limits.
> 
> ...but the sheer number of these runaway things caused the overcommit to
> be a problem.  If malloc() or whatever had actually returned NULL at the
> right time (i.e. as backing store was about to become overcommitted), then
> these runaway processes would have stopped running away (they would have
> gotten a SIGSEGV and died).
> 
> Anyhow, my "lame undergrads" example comes from a time when PCs weren't
> really powerful enough for the job (or something; anyhow, we didn't have
> any in the department :-).  My example is from a Sequent Balance (16
> ns32032 processors, 64M RAM [I think; been a while], 4.2BSD variant).

So, tell me... when NetBSD gets its non-overcommit switch, would
you use it in the environment you describe?

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral

[EMAIL PROTECTED] wrote:
> 
> What is so evil about having a reasonably intelligent malloc() that
> tells the truth, and returns unused memory to the system? Overcommit
> is for lazy programmers, plain and simple. At least the SGI documentation
> about overcommit admits that (or at least, did at one time).

Yes. So are high-level languages, as a matter of fact. True
memory-conscious programmers will never use anything besides
assembler.

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"






