Re: Replacement for grep(1) (part 2)

1999-07-17 Thread Matthew Dillon


:
:It results sometimes in out of swap, too.
:
: Inetd is rate-limited by default nowadays, so this really doesn't apply.
:
:It really does apply. Inetd limits incoming connections per minute, not per
:second. It is possible to use minute limit in a few seconds and cause a high
:load. Sendmail is worse than inetd; it cannot limit incoming rate on
:
:Netch

You can specify a maximum fork limit for inetd on a per-service basis.

You are a year or two too late on these things.  A great many improvements
have been made to programs like sendmail and inetd explicitly to deal 
with overload situations.  Web servers too.  These were fairly simple
changes as well.  For sendmail it was as simple as making MaxDaemonChildren
apply to queue runs - I submitted that one to Eric Allman two years ago
and it's been a part of sendmail since then.  For inetd it is the -c, -C,
and -R options (which can be specified on a per-service basis as well).
Dima and I added the -R option back in 1997 specifically to help with
DoS attacks.

Sendmail is not an issue when properly configured.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Replacement for grep(1) (part 2)

1999-07-17 Thread Valentin Nechayev
Brian F. Feldman wrote:

 There are other ways.  For example, even if a user account is resource
 limited, root processes (such as sendmail, popper, identd, and so forth)
 are not.  Attacks against these servers generally result in very high
 loads and sometimes make it difficult to login to fix the problem, but do
 not result in running out of swap.

Sometimes they do result in running out of swap, too.

 Inetd is rate-limited by default nowadays, so this really doesn't apply.

It really does apply. Inetd limits incoming connections per minute, not per
second. It is possible to use up the whole minute's limit in a few seconds
and cause a high load. Sendmail is worse than inetd; it cannot limit the
incoming rate on an established connection. Butenko's (bute...@stalker.com)
DoS attack on sendmail is to send thousands of letters to a local user over
a fast network connection (e.g., Ethernet) through one established TCP
connection. The only barrier is the LA test made before sending the
'250 XXX message accepted for delivery' reply and before the
fork-and-deliver-or-queue-and-exit decision, but an attacker can send too
many letters in a few seconds, leaving hundreds of /usr/libexec/mail.local
delivery processes blocked waiting on the mailbox lock. LA describes the
system state over the last minute and thus is about as useful as the average
temperature of a hospital's patients over the last year. ;( I have seen a
variant of this attack on my mail hosts, when a host with 6000 letters in
its mail queue (a mail2news server) sent all its mail to a smarthost (a uucp
spool server); after ~500 letters, sendmail on the smarthost closed port 25
on RefuseLA; it was saved from out-of-swap only because domain resolving
took some time. The only mechanism against this type of attack I can imagine
is to sm_sleep(1) at 'mail from:' in the SMTP server code or before
'250 Message accepted for delivery'.
For inetd, we must limit connections per second, not per minute.
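
[Editor's sketch] The per-second limit asked for above can be illustrated
with a token bucket. This is not inetd code; the struct, the function names
and the numbers are hypothetical. A budget of 240 connections per minute is
expressed as 4 per second, so a burst arriving in one instant gets 4
accepts instead of 240.

```c
#include <assert.h>

/* Hypothetical token-bucket limiter, not taken from inetd. */
struct limiter {
    double tokens;  /* accepts currently available */
    double rate;    /* tokens added per second */
    double burst;   /* most tokens that may accumulate */
    double last;    /* time of the previous refill, in seconds */
};

void limiter_init(struct limiter *l, double rate, double burst, double now)
{
    assert(rate > 0 && burst >= 1);
    l->tokens = burst;
    l->rate = rate;
    l->burst = burst;
    l->last = now;
}

/* Returns 1 if a connection arriving at time `now` may be accepted. */
int limiter_allow(struct limiter *l, double now)
{
    double elapsed = now - l->last;

    if (elapsed > 0) {
        /* Refill in proportion to elapsed time, capped at the burst size. */
        l->tokens += elapsed * l->rate;
        if (l->tokens > l->burst)
            l->tokens = l->burst;
        l->last = now;
    }
    if (l->tokens >= 1.0) {
        l->tokens -= 1.0;
        return 1;
    }
    return 0;
}

/* Counts how many of `n` connections arriving together at `now` pass. */
int limiter_burst(struct limiter *l, double now, int n)
{
    int accepted = 0;

    for (int i = 0; i < n; i++)
        accepted += limiter_allow(l, now);
    return accepted;
}
```

A plain per-minute counter would let all 240 connections of such a burst
through in the first second, which is the situation described above.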

--
Netch







Re: Replacement for grep(1) (part 2)

1999-07-17 Thread Valentin Nechayev
Matthew Dillon wrote:

 Give me a shell and I can crash any machine.

Oh. ;|

 A good example of this is sendmail.  Before the MaxDaemonChildren and
 MaxArticleSize options, it was possible for sendmail to overcommit a
 machine.  In this case the overcommit that can occur is with I/O, not
 swap.  As a general performance rule, you have to set MaxDaemonChildren
 and MaxArticleSize to prevent the overcommit from occurring.  This is a
 function of sendmail, not a function of the kernel.

Sigh. ((c) you) Sendmail can overcommit a machine even with the right set of
MaxDaemonChildren, MaxArticleSize, QueueLA and RefuseLA options - I have seen
such situations. MaxDaemonChildren limits only the number of main processes
for incoming connections (plus queue run processes). For each connection,
after 'mail from:' and until the message is accepted, the server process for
the incoming connection forks a child which accepts the recipient list and
the letter body. After accepting the message, that child can fork a delivery
process. A queue run process with the O ForkEachJob=true option, which is
default, can create a delivery process for each queue job (in my experience,
a queue of more than 1000 jobs is an ordinary event). All these forks depend
on only one test - get the current LA and compare it with QueueLA - which
fails to catch a high load that appeared less than a minute ago. To prevent
this overcommit (the details overlap with my parallel message), the minimal
(and possibly insufficient) setup is:
1) patch - insert sm_sleep(1) into the server subprocess code before the
'accepted' reply, to limit the incoming mail rate;
2) Decrease QueueLA for the listening daemon to a sub-minimal value
(e.g. 2);
3) Increase QueueLA for the queue running daemons to a high value (e.g. 50)
and set O ForkEachJob=false for them.

But most of these tunings are indirect. A direct tuning, arrived at
experimentally on my mail servers, is a specially hacked pstat program that
returns 1 if either swap or file descriptors are more than 2/3 used, and 0
otherwise; on getting 1, sendmail stops delivering. Sadly, this check is
unportable.
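
[Editor's sketch] The decision made by the hacked pstat can be written as a
small pure function. How the counters are obtained is the unportable part
and is left out here; the function names are hypothetical and the inputs
are assumed to come from whatever interface the platform provides.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if `used` is more than 2/3 of `total`, else 0.
 * Compares 3*used > 2*total in integers to avoid rounding;
 * safe from overflow for values below 2^62. */
int over_two_thirds(uint64_t used, uint64_t total)
{
    assert(used <= total);
    return 3 * used > 2 * total;
}

/* Mirrors the exit status described above: 1 means "stop delivering",
 * returned when either swap or file descriptors pass the threshold. */
int resources_tight(uint64_t swap_used, uint64_t swap_total,
                    uint64_t fds_used, uint64_t fds_total)
{
    return over_two_thirds(swap_used, swap_total) ||
           over_two_thirds(fds_used, fds_total);
}
```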

(P.S. Don't tell me to change MTA; that is an entirely different question.)

 Another good example is a web server.  A web server must have specific
 limitations on the number of simultaneous connections it is allowed
 to handle at once and on the number of CGI's or other auxiliary programs
 that are allowed to be running at any given time.  The overcommit issue
 here has nothing to do with swap and everything to do with performance.
 Specifically, these limitations exist to avoid cascade failures.

As in the sendmail case, you propose making some calculations (which are
difficult and non-trivial for newbies) to estimate the necessary resources.
Another way, which is imho more acceptable, is to provide not hard barriers
(SIGKILL on overcommitting) but soft barriers (e.g., stop allocating memory
for non-wheel users when memory begins to run out). An extra 64M of memory
or a disk for swap is usually far cheaper than the profit lost when a
critical service crashes.

 In the same manner any truly critical system server must handle the
 resource management itself to deal with all sorts of problem situations,
 including memory.  You do not need to build any of this control into the
 kernel.

No, we need it. Not every server can be patched for such tests (due to loss
of sources or some other reason), and not every admin can make the necessary
patches. The kernel must help with it.

--
Netch




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Narvi


[cc: list trimmed]

On Thu, 15 Jul 1999 [EMAIL PROTECTED] wrote:

  In that scenario, the 512MB of swap I assigned to this machine would be
  dangerously low.
 
 With 13GB disks available for a couple of hundred bucks, my machines aren't
 going to run out of swap space any time soon, even if I commit to disk.
 
 All I want for Christmas is a knob to disable overcommit.
 
 --lyndon
 

CVSup the source repository and start writing.

Sander

There is no love, no good, no happiness and no future -
all these are just illusions.






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel C. Sobral

Matthew Dillon wrote:
 
 Something is weird here.  If the solaris people are using a
 SWAPSIZE + REALMEM VM model, they have to allow the
 allocated + reserved space go +REALMEM bytes over available swap
 space.  If not they are using only a SWAPSIZE VM model.

I did not check if the model was a SWAPSIZE+REALMEM or a SWAPSIZE
model. Anyway, I think you are assuming that the "swap -s" command
shows as total memory just the swap space... Maybe, maybe not. I
don't know. But the space against which I reached the ceiling *was*
the one reported in the "swap -s" command.

 Wait - does Solaris normally use swap files or swap partitions?
 Or is it that weird /tmp filesystem stuff?  If it normally uses swap
 files and allows holes then that explains everything.

I'd say partitions. While perusing man pages, I caught briefly the
comment that a swap partition could overwrite a normal partition, in
a man page about a special command to create swap partitions.

Anything you'd like me to check in particular? If you have any
source code you'd like me to run, just send it to
[EMAIL PROTECTED], though I can only run them at the
earliest on monday. Well, at least my monday is your sunday night...
:-)

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Replacement for grep(1) (part 2)

1999-07-16 Thread Ville-Pertti Keinonen


[EMAIL PROTECTED] (Chris G. Demetriou) writes:

 Matthew Dillon [EMAIL PROTECTED] writes:
  The text size of a program is irrelevant, because swap is never
  allocated for it.  The data and BSS are only relevant when they

No, you can mprotect read-only vnode mappings to writable.  Most
things wouldn't be hurt badly if this changed, though; I suspect that
this already varies between operating systems.

  are modified.
  
  The only thing swap is ever used for is the dynamic allocation of memory.
  There are three ways to do it:  sbrk(), mmap(... MAP_ANON), or
  mmap(... MAP_PRIVATE).

 yup, almost: not all MAP_PRIVATE mappings need backing store, only
 MAP_PRIVATE and writeable mappings.  (MAP_PRIVATE does _not_ guarantee
 that you won't see modifications made via other MAP_SHARED mappings.)

...but in *this* case, you certainly shouldn't allow mprotect to fail
(with what, ENOMEM?).

It's certainly counterintuitive to me that mprotect could fail due to
a resource shortage.
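
[Editor's sketch] The case under discussion, a read-only MAP_PRIVATE file
mapping upgraded to writable with mprotect(), looks roughly like this on a
POSIX system. The function name and the temporary file path are
hypothetical. Under a strict no-overcommit policy the mprotect() call is
the point where backing store would have to be reserved, and so the point
where it could fail.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps a temp file read-only and private, upgrades the mapping to
 * writable, writes through it, then re-reads the file.
 * Returns 1 if the file is unchanged (the write went to a private
 * copy), 0 if the file changed, -1 on any error. */
int private_write_leaves_file_alone(void)
{
    char path[] = "/tmp/mprotect-demo-XXXXXX";
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* the file lives until fd is closed */

    long pagesize = sysconf(_SC_PAGESIZE);
    assert(pagesize > 0);
    size_t len = (size_t)pagesize;

    if (write(fd, "A", 1) == 1) {
        char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            /* Only from here on can the mapping need anonymous memory. */
            if (mprotect(p, len, PROT_READ | PROT_WRITE) == 0) {
                char on_disk = 0;
                p[0] = 'B';     /* copy-on-write: touches a private page */
                if (pread(fd, &on_disk, 1, 0) == 1)
                    result = (on_disk == 'A');
            }
            munmap(p, len);
        }
    }
    close(fd);
    return result;
}
```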

 Actually, only now have you brought that up.  And, that's very system
 dependent.  On NetBSD/i386 the default is 2MB, and, it's worth noting
 that you only need to reserve as much as the current stack limit
 allows (after that, you're going to get a signal anyway, and if more

So what setrlimit accepts depends on how much memory is available?

Ok, programs changing their stack limit are rare, but this would still
be another API change.





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Sean Witham



"Daniel C. Sobral" wrote:

  It would be nice to have a way to indicate that, a la SIGDANGER.
 
 Ok, everybody is avoiding this, so I'll comment. Yes, this would be
 interesting, and a good implementation will very probably be
 committed. *BUT*, this is not as useful as it seems. Since the
 correct solution is buy more memory/increase swap (correct solution
 for our target markets, anyway), there is little incentive to
 implement it.
 
 So, I think people who can answer the above is thinking like "Well,
 it is useful, but it's not useful enough for me to spend my time on
 it, and I'm sure as hell don't want to write mini-papers on why it's
 not that useful".
 

For those who wish to develop code for safety-related systems, that is
not good enough. They have to prove that all code can handle the
degradation of resources gracefully. Such code relies on guaranteed
memory allocations, or at the very least on warnings of memory shortage
and prioritized allocations, so that the least important sub-systems
die first.

--Sean





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon


:
:For those who wish to develop code for safety related systems that is
:not good enough. They have to prove that all code can handle the
:degradation
:of resources gracefully. Such code relies on guaranteed memory
:allocations
:or in the very least warnings of memory shortage and prioritized
:allocations.
:So the least important sub-systems die first.
:
:--Sean

I'm sorry, but when you write code for a safety related system you
do not dynamically allocate memory at all.  It's all essentially static.
There is no issue with the memory resource.  Besides, none of the BSD's are
certified for any of that stuff that I know of.

What's next:  A space shot?  These what-if scenarios are getting
ridiculous.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Brownlee

On Fri, 16 Jul 1999, Matthew Dillon wrote:

 I'm sorry, but when you write code for a safety related system you
 do not dynamically allocate memory at all.  It's all essentially static.
 There is no issue with the memory resource.  Besides, none of the BSD's are
 certified for any of that stuff that I know of.
 
 What's next:  A space shot?  These what-if scenarios are getting
 ridiculous.

Well, NetBSD is slated to be used in the 'Space Acceleration
Measurement System II', measuring the microgravity environment on
the International Space Station using a distributed system based
on several NetBSD/i386 boxes.

Sometimes your 'what-if' scenarios are others' standard operating
procedures.

David/absolute

   What _is_, what _should be_, and what _could be_ are all distinct.








Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Alan C. Horn

On Fri, 16 Jul 1999, Matthew Dillon wrote:


:  Well, NetBSD is slated to be used in the 'Space Acceleration
:  Measurement System II', measuring the microgravity environment on
:  the International Space Station using a distributed system based
:  on several NetBSD/i386 boxes.
:
:  Sometimes your 'what-if' scenarios are others' standard operating
:  procedures.
:
:  David/absolute
:
:   What _is_, what _should be_, and what _could be_ are all distinct.

Ummm... this doesn't sound like a critical system to me.  It sounds like
an experiment.


It's probably an awfully expensive experiment (putting things into space
is not cheap)

From a financial viewpoint that may be considered critical.

Cheers,

Al


--
Alan Horn - Sysadmin - Dreamworks (+1 818 695 6256) - [EMAIL PROTECTED]
  I am Connor MacLeod of the Clan MacLeod. I was born in 1518 in the
village of Glenfinnan on the shores of Loch Sheil, and I am immortal.







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel Eischen

 I'm sorry, but when you write code for a safety related system you
 do not dynamically allocate memory at all.  It's all essentially static.
 There is no issue with the memory resource.  Besides, none of the BSD's are
 certified for any of that stuff that I know of.

Sometimes it's not feasible to statically allocate memory.  You
dynamically allocate all the memory you need at program initialization 
(and no, we don't want to manage a pool of memory ourselves - that's
what the OS is for).  

Note that languages such as Ada raise exceptions when memory allocation
fails.  The underlying run-time relies on malloc returning null in
order to raise an exception.  Normally, programs written in Ada
take great care to gracefully handle these exceptions.  All the C
programs that we've ever written also take great care in handling
NULL returns from malloc.
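
[Editor's sketch] The pattern described above, taking all memory at program
initialization and treating a NULL return as a reason not to start, might
look like this in C. The struct and function names are hypothetical. On an
overcommitting kernel a non-NULL return does not by itself prove the pages
can be backed later, which is the concern this thread is about.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical working set of a long-running program. */
struct buffers {
    double *samples;
    char   *log;
};

/* Allocates every buffer up front. Returns 0 on success. On failure it
 * frees whatever was obtained and returns -1, so the caller can refuse
 * to start instead of failing halfway through its work. */
int buffers_init(struct buffers *b, size_t nsamples, size_t logsize)
{
    assert(nsamples > 0 && logsize > 0);
    b->samples = calloc(nsamples, sizeof *b->samples);
    b->log = calloc(logsize, 1);
    if (b->samples == NULL || b->log == NULL) {
        free(b->samples);       /* free(NULL) is a no-op */
        free(b->log);
        b->samples = NULL;
        b->log = NULL;
        return -1;
    }
    return 0;
}

void buffers_free(struct buffers *b)
{
    free(b->samples);
    free(b->log);
    b->samples = NULL;
    b->log = NULL;
}
```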

I have no problem with overcommit, but I can see the need that
some folks have for turning it off.  If you don't want to write
the code to allow this, that's fine - you don't want/need it,
so why should you?  But if other folks see a need for it, let
_them_ write the hooks for it :-)

Dan Eischen
[EMAIL PROTECTED]





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon

: I'm sorry, but when you write code for a safety related system you
: do not dynamically allocate memory at all.  It's all essentially static.
: There is no issue with the memory resource.  Besides, none of the BSD's are
: certified for any of that stuff that I know of.
:
:Sometimes it's not feasible to statically allocate memory.  You
:dynamically allocate all the memory you need at program initialization 
:(and no, we don't want to manage a pool of memory ourselves - that's
:what the OS is for).  
:...
:Note that languages such as Ada raise exceptions when memory allocation
:fails.  The underlying run-time relies on malloc returning null in
:order to raise an exception.  Normally, programs written in Ada

Simply set a resource limit. 

You are making the classic mistake of assuming that a fail-safe in the
O.S. must be integrated all the way down into the user level when, 
in fact, it is simply a matter of setting a resource limit.

When you are running an embedded system and have full control over the
software being run, setting resource limits will do what you want.  By
doing so you are effectively managing the software modules on a 
module-by-module basis and not allowing one module to indirectly affect
another.  This is what you want to do in an embedded system:  You do
not want to create a situation where a failure in one module cascades
into others.
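
[Editor's sketch] One way to see a resource limit turn an oversized request
into a NULL return from malloc. The message above does not say which limit
is meant; RLIMIT_AS is an assumption made here because it is available on
current Linux and FreeBSD systems, and the function name is hypothetical.
The limit is applied in a forked child so that the caller is unaffected.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child, caps its address space at `limit` bytes, and asks
 * malloc for `request` bytes there. Returns 1 if malloc returned NULL,
 * 0 if it succeeded, -1 if fork, setrlimit or wait failed. */
int malloc_fails_under_limit(rlim_t limit, size_t request)
{
    assert(request > 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit rl;

        if (getrlimit(RLIMIT_AS, &rl) != 0)
            _exit(2);
        rl.rlim_cur = limit;    /* lower the soft limit only */
        if (setrlimit(RLIMIT_AS, &rl) != 0)
            _exit(2);

        char *p = malloc(request);
        if (p != NULL)
            *(volatile char *)p = 1;  /* keeps the call from being optimized out */
        _exit(p == NULL ? 1 : 0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code <= 1 ? code : -1;
}
```

For example, with a 256 MB cap a 1 GB request is expected to come back NULL
in the child, so the program can handle the failure itself without
affecting other modules.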

-Matt
Matthew Dillon 
[EMAIL PROTECTED]

:take great care to gracefully handle these exceptions.  All the C
:programs that we've ever written also take great care in handling
:NULL returns from malloc.
:
:I have no problem with overcommit, but I can see the need that
:some folks have for turning it off.  If you don't want to write
:the code to allow this, that's fine - you don't want/need it,
:so why should you?  But if other folks see a need for it, let
:_them_ write the hooks for it :-)
:
:Dan Eischen
:[EMAIL PROTECTED]
:






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Scheidt

On Fri, 16 Jul 1999, Daniel C. Sobral wrote:

 Technical follow-up:
 
 Contrary to what I previously said, a number of tests reveal that
 Solaris, indeed, does not overcommit. All non-read only segments,

Neither does HP/UX 10.x. (Haven't got an 11 box handy to check.) 
The memory allocation process is something like this:
1) reserve is allocated from a swap area.  Preference is given to
swap devices, even if a swap file system has a higher priority.
2) If there is no space on a swap device, swap is allocated from a 
swap filesystem, if one is configured.  If there is nothing to be
allocated in a swap filesystem, the kernel attempts to grow the 
swap file on a filesystem by swchunk (a tunable, default 2MB, I think).
(Swap on filesystems starts at zero or swchunk, and is grown as needed
up to the limit spec'd at swapon(1M) time.)
3) If this fails, either because there is no space on the file system, 
or the swapfile has reached its limit, memory (actual core) is allocated.
The system tunable swapmem_on determines whether memory is used for 
swap reserve or not.  Default is to use it.
4) If there isn't swap to reserve, the request fails, even if none of 
the reserved swap is used.  
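
[Editor's sketch] The reservation order in steps 1-4 can be modelled as
follows. This is a toy model, not HP-UX code: the struct fields and
function names are hypothetical, sizes are in MB, and each request is
assumed to be satisfied from a single source.

```c
#include <assert.h>
#include <stddef.h>

/* Toy state of the paging areas; all sizes in MB. */
struct swap_state {
    size_t dev_free;      /* unreserved space on swap devices */
    size_t fs_size;       /* current size of the filesystem swap file */
    size_t fs_reserved;   /* reserved part of fs_size */
    size_t fs_limit;      /* ceiling given when the swap was enabled */
    size_t fs_disk_free;  /* free space left on that filesystem */
    size_t swchunk;       /* growth increment of the swap file */
    size_t mem_free;      /* core that may be used as reserve */
    int    swapmem_on;    /* nonzero: memory may back reservations */
};

enum source { SRC_NONE, SRC_DEVICE, SRC_FILESYSTEM, SRC_MEMORY };

/* Reserves `need` MB and reports where the reservation came from. */
enum source reserve(struct swap_state *s, size_t need)
{
    assert(s->swchunk > 0 && s->fs_reserved <= s->fs_size);

    /* 1) Swap devices are preferred. */
    if (s->dev_free >= need) {
        s->dev_free -= need;
        return SRC_DEVICE;
    }

    /* 2) Filesystem swap, grown by swchunk up to its limit. */
    size_t fs_avail = s->fs_size - s->fs_reserved;
    size_t grow = 0;
    while (fs_avail + grow < need &&
           s->fs_size + grow + s->swchunk <= s->fs_limit &&
           grow + s->swchunk <= s->fs_disk_free)
        grow += s->swchunk;
    if (fs_avail + grow >= need) {
        s->fs_size += grow;
        s->fs_disk_free -= grow;
        s->fs_reserved += need;
        return SRC_FILESYSTEM;
    }

    /* 3) Memory, if the tunable allows it. */
    if (s->swapmem_on && s->mem_free >= need) {
        s->mem_free -= need;
        return SRC_MEMORY;
    }

    /* 4) Nothing left to reserve: fail, even if no reserved page
     *    has actually been used yet. */
    return SRC_NONE;
}
```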

The swapinfo(1M) man page makes this quite clear:

  +Requests for more paging space will fail when they cannot be
   satisfied by reserving device, file system, or memory paging,
   even if some of the reserved paging space is not yet in use.
   Thus it is possible for requests for more paging space to be
   denied when some, or even all, of the paging areas show zero
   usage - space in those areas is completely reserved.

The upside of  this is that if you do run out of swap, the kernel doesn't 
kill random processes.  The downside is, I have seen 4GB boxes, with 
plenty of swap, run out with less than a gig of memory actually in use.  
Oh, and if you swap to a filesystem, you can fill it up, without actually
using any of the space.

I don't know which behavior is more bogus.


David Scheidt






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Dominic Mitchell
On Thu, Jul 15, 1999 at 09:57:31PM -0700, Matthew Dillon wrote:
 Something is weird here.  If the solaris people are using a 
 SWAPSIZE + REALMEM VM model, they have to allow the 
 allocated + reserved space go +REALMEM bytes over available swap 
 space.  If not they are using only a SWAPSIZE VM model.
 
 Wait - does Solaris normally use swap files or swap partitions?
 Or is it that weird /tmp filesystem stuff?  If it normally uses swap 
 files and allows holes then that explains everything.

No, swap is slice based in Solaris.  tmpfs is just a filesystem (much
like MFS) which uses swap as backing store.  I will admit to never quite
understanding the relationship of how much swap tmpfs is willing to
steal though...  Maybe I should go and read the answerbook
(http://docs.sun.com if you want a peek).
-- 
Dom Mitchell -- Palmer & Harvey McLane -- Unix Systems Administrator

In Mountain View did Larry Wall
Sedately launch a quiet plea:
That DOS, the ancient system, shall
On boxes pleasureless to all
Run Perl though lack they C.


Re: Replacement for grep(1) (part 2)

1999-07-16 Thread Valentin Nechayev
Daniel C. Sobral wrote:

 Eh? Reasonable programs *never* run into trouble. Trouble only
 happens when you have unreasonable programs around, or did not
 configure the system correctly. And if you did not configure the
 system correctly, why do you think you would be able to correctly
 estimate the stack needed for the various programs?

Your words are bad words. Exhausting any of the main resources - virtual
memory, disk space, process descriptors, file descriptors - is a terrible
situation, but one must not cure a headache by cutting off the head.
Any system can fall into an uncontrolled state and eat all of some resource,
and the kernel's job is to protect part of the process pool from this, not to
destroy it. I have seen two boxes where swap ran out with bad results:
on the first (FreeBSD 2.2.7), the system killed the cron (sic!) process; on
the second (Linux), syslogd, sendmail and some others became poisoned without
any warning. This is totally bad behavior; the kernel must be a friend, not
an enemy.

Actions I consider sufficient as a first (!) step:
1) Count overflows of virtual memory, file descriptors, process descriptors
and other critical resources in kernel variables (readable by sysctl).
This data must be available to watchdogs; for some systems, the right thing
is to reboot immediately after an overflow rather than try to work in a
poisoned state.
2) Run (in the standard setup!) cron, syslogd and other important daemons
from a special init slot (as Linux and possibly other systems allow), not
from startup scripts. Reason: they must be restarted when they die, without
admin intervention and without wrappers, which can also be killed when
memory is low.
3) Declare thresholds for critical resources; for example, when more than
80% of virtual memory is used, prevent everybody except euid==0 or egid==0
from allocating new memory.
4) Provide a special signal (SIGXMEM?) to tell processes that memory is low
and that they all have to reduce their memory use. Daemons should interpret
this signal similarly to SIGHUP, and exec() themselves to restart.
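
[Editor's sketch] The threshold rule in point 3 comes down to a small
predicate. This is only an illustration of the proposed policy, not code
from any kernel; the function name is hypothetical and the 80% figure is
the example value given above.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* Returns 1 if the caller may allocate new memory, 0 otherwise.
 * Once more than 80% of virtual memory is in use, only processes with
 * euid 0 or egid 0 are still allowed to allocate. */
int may_allocate(uint64_t vm_used, uint64_t vm_total, uid_t euid, gid_t egid)
{
    assert(vm_used <= vm_total);

    /* used/total > 80%, in integers; overflow-free below 2^61 */
    int above_threshold = 5 * vm_used > 4 * vm_total;

    if (!above_threshold)
        return 1;
    return euid == 0 || egid == 0;
}
```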

 Now comes the people saying don't overcommit in *this* case, and
 overcommit in *that* case. Irrelevant. Programs are still getting
 killed because memory was overcommitted (with the added disadvantage
 of you not having as much memory as in a full overcommit mode).

The kernel can kill processes that try to get nonexistent memory. But when
it has done nothing to prevent the system from falling into overflow, it
plays an unfair game.

--
Netch







Re: Replacement for grep(1) (part 2)

1999-07-16 Thread Ville-Pertti Keinonen

jul...@whistle.com (Julian Elischer) writes:

 If you wanted to fix this, you could add a patch to malloc that touched
 every page that it handed to the application. (and trapped sig11s)

How would you expect that to work?

Several misunderstandings seem to be common regarding this issue (most
not directed at you):

 - malloc almost never fails with NULL.  This is not true; if resource
limits are set properly, any one program using huge amounts of memory
is going to hit them long before swap space is exhausted.

 - The program currently trying to get the page is the one that is
killed.

 - Actually paging in all memory is going to protect a program from
getting killed.  This is going to make it *more likely* for it to be
killed.

 - Not overcommitting doesn't consume huge amounts of reserve space
unless programs do something special.

A rough sum of memory usage can be computed by summing up all of the
process VSZs plus your stack limit times the number of processes.  How
many of you would be willing to configure that much swap space?
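
[Editor's sketch] The rough sum described above, written out. The function
name and the sample figures are hypothetical; all sizes are in KB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Worst-case reservation without overcommit, following the rule above:
 * the sum of all process VSZs plus the stack limit times the number
 * of processes. All values in KB. */
uint64_t worst_case_kb(const uint64_t *vsz_kb, size_t nprocs,
                       uint64_t stack_limit_kb)
{
    uint64_t total = 0;

    assert(vsz_kb != NULL || nprocs == 0);
    for (size_t i = 0; i < nprocs; i++)
        total += vsz_kb[i];
    return total + stack_limit_kb * nprocs;
}
```

With three processes of 1000, 2000 and 3000 KB and a 65536 KB stack limit,
this gives 6000 + 3 * 65536 = 202608 KB, almost all of it stack
reservation.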

If you really wanted to run without overcommit, you'd only run
statically linked binaries and set your stack limits to small values.
This could be desirable for some (but not general-purpose) systems; an
option for doing this wouldn't be entirely bogus.


Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Patrick Welche
Matthew Dillon wrote:
 
 :On Tue, 13 Jul 1999 23:18:58 -0400 (EDT) 
 : John Baldwin jobal...@vt.edu wrote:
 :
 :  What does that have to do with overcommit?  I student-administrate an
 :  undergrad CS lab at a university, and when students' programs misbehave,
 :  they generate a fault and are killed.  The only machines that reboot on
 :  us without being explicitly told to are the NT ones, and yes we run FreeBSD.
 :
 :What does it have to do with overcommit?  Everything in the world!
 :
 :If you have a lot of users, all of which have buggy programs which eat
 :a lot of memory, per-user swap quotas don't necessarily save your butt.
 
 If every single one of your users is trying to crash your machine daily,
 maybe you should consider throwing them off the system and finding users
 that are less hostile.
 
 This conversation is getting silly.  Do you actually believe that
 an operating system can magically protect itself 100% from armloads of 
 hostile users?
 
 Give me a break.  You people are crazy.  If you have something worthwhile
 to say I'll listen, but these "the sky is falling!" arguments are idiotic.
 
   -Matt
 

students != hostile users

Making mistakes is part of learning.

Patrick





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel C. Sobral
Patrick Welche wrote:
 
 students != hostile users

We obviously have known different students... :-)

 Making mistakes is part of learning.

A hostile user is one which will act in a non-friendly manner.
Whether intentionally or not is irrelevant from the point of view of
the administrator, as far as protecting the system goes.

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

Would you like to go out with me?
I'd love to.
Oh, well, n... err... would you?... ahh... huh... what do I do
next?





Re: Replacement for grep(1) (part 2)

1999-07-16 Thread Valentin Nechayev
Daniel C. Sobral wrote:

  4.4BSD derived system cannot do this, and have to use different
  machine for such applications.

 Incorrect. We can set *limits* to the users, so they won't be able
 to crash down the system.

No. In practice, not all users use the system at the same time, and it is
too cruel to set very small limits. So on an average system the per-user
limits are well above (total_resource*2/3)/n_users (2/3 being a sub-optimal
safety factor). But if too many users start using the system at once, they
can overflow the resource. Group limits can soften the problem, but only a
little.

I don't remember the English word for a soft barrier; the Russian word is
'dempfer' ;) The system must provide such a soft barrier to stop overflow
well before the real limit is reached. IMHO, 20% of a typical critical
resource should be held back.

--
Netch







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Sean Witham


Daniel C. Sobral wrote:

  It would be nice to have a way to indicate that, a la SIGDANGER.
 
 Ok, everybody is avoiding this, so I'll comment. Yes, this would be
 interesting, and a good implementation will very probably be
 committed. *BUT*, this is not as useful as it seems. Since the
 correct solution is buy more memory/increase swap (correct solution
 for our target markets, anyway), there is little incentive to
 implement it.
 
 So, I think people who can answer the above are thinking like "Well,
 it is useful, but it's not useful enough for me to spend my time on
 it, and I sure as hell don't want to write mini-papers on why it's
 not that useful."
 

For those who wish to develop code for safety-related systems, that is
not good enough. They have to prove that all code can handle the
degradation of resources gracefully. Such code relies on guaranteed memory
allocations, or at the very least on warnings of memory shortage and
prioritized allocations, so that the least important sub-systems die first.

--Sean





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon

:
:For those who wish to develop code for safety-related systems, that is
:not good enough. They have to prove that all code can handle the
:degradation of resources gracefully. Such code relies on guaranteed memory
:allocations, or at the very least on warnings of memory shortage and
:prioritized allocations, so that the least important sub-systems die first.
:
:--Sean

I'm sorry, but when you write code for a safety related system you
do not dynamically allocate memory at all.  It's all essentially static.
There is no issue with the memory resource.  Besides, none of the BSDs are
certified for any of that stuff that I know of.

What's next:  A space shot?  These what-if scenarios are getting
ridiculous.

-Matt
Matthew Dillon 
dil...@backplane.com





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Brownlee
On Fri, 16 Jul 1999, Matthew Dillon wrote:

 I'm sorry, but when you write code for a safety related system you
 do not dynamically allocate memory at all.  It's all essentially static.
 There is no issue with the memory resource.  Besides, none of the BSDs
 are certified for any of that stuff that I know of.
 
 What's next:  A space shot?  These what-if scenarios are getting
 ridiculous.

Well, NetBSD is slated to be used in the 'Space Acceleration
Measurement System II', measuring the microgravity environment on
the International Space Station using a distributed system based
on several NetBSD/i386 boxes.

Sometimes your 'what-if' scenarios are others' standard operating
procedures.

David/absolute

   What _is_, what _should be_, and what _could be_ are all distinct.








Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon

:   Well, NetBSD is slated to be used in the 'Space Acceleration
:   Measurement System II', measuring the microgravity environment on
:   the International Space Station using a distributed system based
:   on several NetBSD/i386 boxes.
:
:   Sometimes your 'what-if' scenarios are others' standard operating
:   procedures.
:
:   David/absolute
:
:   What _is_, what _should be_, and what _could be_ are all distinct.

Ummm... this doesn't sound like a critical system to me.  It sounds like
an experiment.

None of the BSDs (nor NT, nor any other complex general purpose operating
system) are certified for critical systems in space.  The reason is
simple:  None of these operating systems can deal with memory faults 
caused by radiation.  You might see it for internal communications or
non-critical sensing, but you aren't going to see it for external
communications or thruster control.

-Matt
Matthew Dillon 
dil...@backplane.com





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Alan C. Horn
On Fri, 16 Jul 1999, Matthew Dillon wrote:


:  Well, NetBSD is slated to be used in the 'Space Acceleration
:  Measurement System II', measuring the microgravity environment on
:  the International Space Station using a distributed system based
:  on several NetBSD/i386 boxes.
:
:  Sometimes your 'what-if' scenarios are others' standard operating
:  procedures.
:
:  David/absolute
:
:   What _is_, what _should be_, and what _could be_ are all distinct.

Ummm... this doesn't sound like a critical system to me.  It sounds like
an experiment.


It's probably an awfully expensive experiment (putting things into space
is not cheap)


Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel Eischen
 I'm sorry, but when you write code for a safety related system you
 do not dynamically allocate memory at all.  It's all essentially static.
 There is no issue with the memory resource.  Besides, none of the BSDs
 are certified for any of that stuff that I know of.

Sometimes it's not feasible to statically allocate memory.  You
dynamically allocate all the memory you need at program initialization 
(and no, we don't want to manage a pool of memory ourselves - that's
what the OS is for).  

Note that languages such as Ada raise exceptions when memory allocation
fails.  The underlying run-time relies on malloc returning null in
order to raise an exception.  Normally, programs written in Ada
take great care to gracefully handle these exceptions.  All the C
programs that we've ever written also take great care in handling
NULL returns from malloc.

I have no problem with overcommit, but I can see the need that
some folks have for turning it off.  If you don't want to write
the code to allow this, that's fine - you don't want/need it,
so why should you?  But if other folks see a need for it, let
_them_ write the hooks for it :-)

Dan Eischen
eisc...@vigrid.com





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon
: I'm sorry, but when you write code for a safety related system you
: do not dynamically allocate memory at all.  It's all essentially static.
: There is no issue with the memory resource.  Besides, none of the BSDs
: are certified for any of that stuff that I know of.
:
:Sometimes it's not feasible to statically allocate memory.  You
:dynamically allocate all the memory you need at program initialization 
:(and no, we don't want to manage a pool of memory ourselves - that's
:what the OS is for).  
:...
:Note that languages such as Ada raise exceptions when memory allocation
:fails.  The underlying run-time relies on malloc returning null in
:order to raise an exception.  Normally, programs written in Ada

Simply set a resource limit. 

You are making the classic mistake of assuming that a fail-safe in the
O.S. must be integrated all the way down into the user level when, 
in fact, it is simply a matter of setting a resource limit.

When you are running an embedded system and have full control over the
software being run, setting resource limits will do what you want.  By
doing so you are effectively managing the software modules on a 
module-by-module basis and not allowing one module to indirectly affect
another.  This is what you want to do in an embedded system:  You do
not want to create a situation where a failure in one module cascades
into others.
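
As a rough illustration of the resource-limit approach described above, here
is a minimal Python sketch. It is not from the original post: the choice of
RLIMIT_AS, the helper name alloc_under_limit, and the sizes are all
assumptions made for the example. A child process caps its own address space
and then tries to allocate; with the cap in place the allocator reports
failure (MemoryError in Python, the analogue of malloc() returning NULL)
instead of the request being overcommitted.

```python
import os
import resource


def alloc_under_limit(limit_bytes, request_bytes):
    """Return True if request_bytes can be allocated while the process
    address space is capped at limit_bytes. Runs in a forked child so
    the caller's own limits are left untouched."""
    pid = os.fork()
    if pid == 0:
        code = 2                        # 2 = could not set up the limit
        try:
            _, hard = resource.getrlimit(resource.RLIMIT_AS)
            soft = limit_bytes
            if hard != resource.RLIM_INFINITY:
                soft = min(soft, hard)  # never ask for more than the hard limit
            resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
            try:
                block = bytearray(request_bytes)
                code = 0                # allocation succeeded
            except MemoryError:
                code = 1                # allocator reported failure
        finally:
            os._exit(code)              # the child must never return
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```

On a system that enforces RLIMIT_AS, alloc_under_limit(2 << 30, 8 << 30)
should return False, while a 1MB request under the same cap should succeed.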

-Matt
Matthew Dillon 
dil...@backplane.com

:take great care to gracefully handle these exceptions.  All the C
:programs that we've ever written also take great care in handling
:NULL returns from malloc.
:
:I have no problem with overcommit, but I can see the need that
:some folks have for turning it off.  If you don't want to write
:the code to allow this, that's fine - you don't want/need it,
:so why should you?  But if other folks see a need for it, let
:_them_ write the hooks for it :-)
:
:Dan Eischen
:eisc...@vigrid.com
:






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Brian F. Feldman
Can we kill this thread already? This resolves nothing. The only good
to come of this is all of the nice doc-proj input Matt is providing
(and providing well, I might add.)

There is no point that hasn't been rehashed a dozen times over, and
you (the ones who want overcommitting turned off) are not helping
the S/N ratio.

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 gr...@freebsd.org   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Scheidt
On Fri, 16 Jul 1999, Daniel C. Sobral wrote:

 Technical follow-up:
 
 Contrary to what I previously said, a number of tests reveal that
 Solaris, indeed, does not overcommit. All non-read only segments,

Neither does HP/UX 10.x. (Haven't got an 11 box handy to check.) 
The memory allocation process is something like this:
1) reserve is allocated from a swap area.  Preference is given to
swap devices, even if a swap file system has a higher priority.
2) If there is no space on a swap device, swap is allocated from a 
swap filesystem, if one is configured.  If there is nothing to be
allocated in a swap filesystem, the kernel attempts to grow the 
swap file on a filesystem by swchunk (a tunable, default 2MB, I think).
(Swap on filesystems starts at zero or swchunk, and is grown as needed
up to the limit spec'd at swapon(1M) time.)
3) If this fails, either because there is no space on the file system, 
or the swapfile has reached its limit, memory (actual core) is allocated.
The system tunable swapmem_on determines whether memory is used for 
swap reserve or not.  Default is to use it.
4) If there isn't swap to reserve, the request fails, even if none of 
the reserved swap is used.  
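
The four steps above can be sketched as a simple ordered search. This is a
deliberately simplified Python model, not HP-UX code: the pool names, the
reserve_swap function, and the all-or-nothing treatment of a request are
assumptions, and growing the swap file by swchunk is not modeled.

```python
def reserve_swap(request_kb, pools, swapmem_on=True):
    """Try to reserve request_kb from the first pool with enough room.

    pools maps 'device', 'filesystem' and 'memory' to free KB and is
    updated in place. Returns the name of the pool used, or None when
    the reservation fails (step 4 above).
    """
    order = ["device", "filesystem"]    # devices are preferred (steps 1-2)
    if swapmem_on:
        order.append("memory")          # step 3: fall back to real memory
    for name in order:
        if pools.get(name, 0) >= request_kb:
            pools[name] -= request_kb   # space is reserved, not written
            return name
    return None                         # nothing left to reserve against
```

Note that the function only does bookkeeping: a reservation can fail even
though none of the reserved space has actually been written, which is the
behavior the swapinfo(1M) text below describes.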

The swapinfo(1M) man page makes this quite clear:

  +Requests for more paging space will fail when they cannot be
   satisfied by reserving device, file system, or memory paging,
   even if some of the reserved paging space is not yet in use.
   Thus it is possible for requests for more paging space to be
   denied when some, or even all, of the paging areas show zero
   usage - space in those areas is completely reserved.

The upside of  this is that if you do run out of swap, the kernel doesn't 
kill random processes.  The downside is, I have seen 4GB boxes, with 
plenty of swap, run out with less than a gig of memory actually in use.  
Oh, and if you swap to a filesystem, you can fill it up, without actually
using any of the space.

I don't know which behavior is more bogus.


David Scheidt






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Michael Schuster - TSC SunOS Germany

Hi everyone,

I've been following this discussion almost from the beginning, and I
have the feeling that we're not _really_ getting very far. There are good
arguments for and against overcommit, depending on your point of view
and your requirements.

What I do see is a not-so-openly voiced consent that the way
resource(sp?) shortages are handled in an overcommitting system
(SIGKILL) makes some of us rather unhappy. I therefore suggest those of
us who would like to see a change in this area pool their efforts and
energies to work on a mechanism that handles resource shortage in a more
graceful way.

cheerio
Michael
-- 
[EMAIL PROTECTED]





Re: Replacement for grep(1) (part 2)

1999-07-15 Thread Daniel C. Sobral

Kevin Schoedel wrote:
 
 Imagine a reasonably big
 program, like Netscape or Emacs, of which you usually just use a
 subset of features. There can easily be many megabytes of code and
 data in them you never actually use, or you don't _usually_ use
 (like the people who use emacs like it was vi :). Without
 overcommit, you need to allocate all that memory for the code, no
 matter whether you end up using it or not. With overcommit, there is
 no such problem.
 
 Code, static data, and not-yet-written writable data should be backed by
 the executable file, not by swap space, so unused code and tables should
 not be a problem.

TEXT should be backed by the executable, as long as the program
doesn't change it to read/write. That's not the code I was referring
to. Not-yet-written blah-blah-blah should be backed by:

1) The executable file if you are overcommitting.
2) RAM/Swap if you are not. If you don't do this, you are
overcommitting. Proof: let the system exhaust its memory. Change a
single byte in the not-yet-written stuff. Now you need more memory
than you have to comply with a regular operation (like changing the
value of a global variable), which means you overcommitted.

Now come the people saying "don't overcommit in *this* case, and
overcommit in *that* case". Irrelevant. Programs are still getting
killed because memory was overcommitted (with the added disadvantage
of you not having as much memory as in a full overcommit mode).

 Stack is more interesting. There might be a place for a global overcommit
 switch. I think I'd be happier with a scheme in which the first page
 or first few pages of stack are committed (so that reasonable programs will
 never run into trouble) and remaining stack is over-/un-committed by
 default, along with means for unusual programs to commit (and/or test
 commitability of) subsequent pages.

Eh? Reasonable programs *never* run into trouble. Trouble only
happens when you have unreasonable programs around, or did not
configure the system correctly. And if you did not configure the
system correctly, why do you think you would be able to correctly
estimate the stack needed for the various programs?

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"







Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Garance A Drosihn

At 6:29 PM -0700 7/14/99, Matthew Dillon wrote:
If 1G isn't enough, spend another $30 and throw 2G of swap
online.  Or perhaps dedicate an entire $150 disk and throw
6+ GB of swap online.

The equivalent setup using a non-overcommit model would require
considerably more swap to have the same reliability.

Please note that we're talking at cross-purposes here, mainly
because I didn't realize this same general topic was being
beaten to death in the 'replacement for grep' thread (which I
have not been following).

Speaking for just me myself and I, I have no problems with the
current overcommit model.  All I'd like to do is have a way to
indicate which processes should not get booted first, if the
system does indeed run out of swap and needs to boot some
processes.  However, other people seem much more worked up
about this topic than I am, and thus what I (personally) meant
as "just casual questions" seem to be taken as "demands that
something be done, RIGHT NOW".

I now realize that some people are arguing that malloc should
return an error if the system runs out of space, but that's not
what I am thinking about.

So, I think I'll bow out of this discussion for now, and maybe
try to discuss my "casual questions" sometime in a different
context...

---
Garance Alistair Drosehn   =   [EMAIL PROTECTED]
Senior Systems Programmer  or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Noriyuki Soda

 On Thu, 15 Jul 1999, Daniel C. Sobral wrote:
 Uh... like any modern unix, Solaris overcommits.

 On Thu, 15 Jul 1999 08:46:36 -0700 (PDT),
"Eduardo E. Horvath" [EMAIL PROTECTED] said:

 Where do you guys get this misinformation?  
:
 Note the `19464k reserved'; that space has been reserved but not yet
 allocated.

Both Dillon and Sobral mistakenly claimed that "Solaris overcommits",
this fact seems to be somewhat suggestive.

Also, the following are the allocated memory and reserved memory
in my environment. (This table also includes Eduardo's example.)

SunOS   allocated   reserved      total   total/allocated
-----   ---------   --------   --------   ---------------
4.1.4       4268k      1248k      5516k   1.2924
4.1.2       7732k      1492k      9224k   1.193
4.1.4       8848k      3080k     11928k   1.3481
4.1.4      13532k      6772k     20304k   1.5004
5.5.1      15312k      5092k     20404k   1.3325
4.1.3      16112k      6512k     22624k   1.4042
4.1.2      26356k      1620k     27976k   1.0615
4.1.4      26560k      3756k     30316k   1.1414
5.5        26076k     11348k     37424k   1.4352
4.1.4      32984k      5556k     38540k   1.1684
5.6        32448k      7072k     39520k   1.2179
4.1.4      38056k      3692k     41748k   1.097
4.1.4      49064k      7672k     56736k   1.1564
4.1.4      67012k      7800k     74812k   1.1164
4.1.4      99348k     16956k    116304k   1.1707
4.1.4     118288k     11780k    130068k   1.0996
5.6       231968k     18880k    250848k   1.0814
5.7       307240k     19464k    326704k   1.0634

(sorted by total amount of used swap)

In those examples, a non-overcommitting system requires 1.06x ... 1.50x
as much swap space as an overcommitting system.  This table also indicates
that the ratio decreases as the total used swap increases.  And the extra
swap space required on a non-overcommitting system is approximately several
tens of megabytes, i.e. the extra cost of a non-overcommitting system is
less than ten dollars in my environment.
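
The range quoted here can be rechecked from the table. The following Python
sketch simply recomputes total/allocated for each row; the (allocated,
reserved) pairs are copied from the table, while the function name
swap_overhead is made up for the example.

```python
# (allocated_kb, reserved_kb) pairs, in the same order as the table.
ROWS = [
    (4268, 1248), (7732, 1492), (8848, 3080), (13532, 6772),
    (15312, 5092), (16112, 6512), (26356, 1620), (26560, 3756),
    (26076, 11348), (32984, 5556), (32448, 7072), (38056, 3692),
    (49064, 7672), (67012, 7800), (99348, 16956), (118288, 11780),
    (231968, 18880), (307240, 19464),
]


def swap_overhead(rows):
    """Return (lowest ratio, highest ratio, largest reservation in KB),
    where ratio = (allocated + reserved) / allocated."""
    ratios = [(alloc + resv) / alloc for alloc, resv in rows]
    return min(ratios), max(ratios), max(resv for _, resv in rows)


low, high, worst_reserved_kb = swap_overhead(ROWS)
print(f"ratio range: {low:.2f}x .. {high:.2f}x")
print(f"largest reservation: {worst_reserved_kb / 1024:.1f} MB")
```

The ratio only measures these particular machines; it says nothing about
workloads with large, sparsely touched mappings.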

Matt Dillon claimed that a non-overcommitting system requires 8x or more
swap space than an overcommitting system. As shown above, that's just wrong.
(There might be cases which require 8x swap, but that is not typical,
 contrary to what Dillon said.)

If you don't want a non-overcommitting system because you don't want to
pay its cost, that's OK, but please don't force us to accept your
limited view.
--
soda





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

:Both Dillon and Sobral mistakenly claimed that "Solaris overcommits",
:this fact seems to be somewhat suggestive.
:
:Also, the following are the allocated memory and reserved memory
:in my environment. (This table also includes Eduardo's example.)
:
:   SunOS   allocated   reserved      total   total/allocated
:   -----   ---------   --------   --------   ---------------
:   4.1.4       4268k      1248k      5516k   1.2924
:   4.1.2       7732k      1492k      9224k   1.193
:   4.1.4       8848k      3080k     11928k   1.3481
:   4.1.4      13532k      6772k     20304k   1.5004
:   5.5.1      15312k      5092k     20404k   1.3325
:   4.1.3      16112k      6512k     22624k   1.4042
:   4.1.2      26356k      1620k     27976k   1.0615
:   4.1.4      26560k      3756k     30316k   1.1414
:   5.5        26076k     11348k     37424k   1.4352
:   4.1.4      32984k      5556k     38540k   1.1684
:   5.6        32448k      7072k     39520k   1.2179
:   4.1.4      38056k      3692k     41748k   1.097
:   4.1.4      49064k      7672k     56736k   1.1564
:   4.1.4      67012k      7800k     74812k   1.1164
:   4.1.4      99348k     16956k    116304k   1.1707
:   4.1.4     118288k     11780k    130068k   1.0996
:   5.6       231968k     18880k    250848k   1.0814
:   5.7       307240k     19464k    326704k   1.0634
:
:   (sorted by total amount of used swap)
:
:In those examples, a non-overcommitting system requires 1.06x ... 1.50x
:...
:soda

Umm... how are you getting the reserved numbers?  Are you
sure that isn't simply cached swap blocks?  I.E. when something
gets swapped out and then is swapped back in and dirtied,
Solaris may be holding the swap block assignment rather
than letting it go.  FreeBSD-stable does the same thing.
FreeBSD-current does not -- it lets it go in order to be
able to reallocate it later as part of a contiguous swath
for performance reasons.

These 'extra' swap blocks are effectively reserved but not
actually allocated.  They can be reassigned.  The numbers
above are very similar to what you would see in a
redirtying-cache swap block situation on a FreeBSD-stable
system.

If I add up all the unshared writeable segments on my
home box - that is, all segments for which one would 
potentially have to reserve swap space - I get a total
of around 382MB.  The machine is currently eating around
100MB of ram and 5MB of swap, or around a 3.5:1 ratio
in this case.  A non-overcommit model would have to 
reserve swap space for 382MB - 100MB = 282MB versus the
5MB of swap the machine actually allocates.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


:"pstat -s" on SunOS4, and "swap -s" on SunOS5. From Solaris man page:
:
::-s Print summary information about total swap
::   space usage and availability:
::
::   allocated   The total amount of swap space
::               (in 1024-byte blocks)
::               currently allocated for use as
::               backing store.
::
::   reserved    The total amount of swap space
::               (in 1024-byte blocks) not
::               currently allocated, but
::               claimed by memory mappings for
::               possible future use.
::
::   used        The total amount of swap space
::               (in 1024-byte blocks) that is
::               either allocated or reserved.
:--
:soda

Yah, that's what I thought.  A solaris expert could tell us
for sure but I am pretty sure those are simply cached swap
blocks after-the-fact, not actual reservations on potentially
swappable space.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

::-s Print summary information about total swap
::   space usage and availability:
::
::   allocated   The total amount of swap space
::               (in 1024-byte blocks)
::               currently allocated for use as
::               backing store.
::
::   reserved    The total amount of swap space
::               (in 1024-byte blocks) not
::               currently allocated, but
::               claimed by memory mappings for
::               possible future use.
::
::   used        The total amount of swap space
::               (in 1024-byte blocks) that is
::               either allocated or reserved.
:--
:soda

It would be really easy to test this.

Write a program that malloc's 32MB of space and touches it,
then sleeps 10 seconds and forks, with both child and parent
sleeping afterwards.  ( the parent and the forked child should
not touch the memory after the fork occurs ).

Do a pstat -s before, after the initial touch, and after
the fork.  If you do not see the reserved swap space jump
by 32MB after the fork, it isn't what you thought it was.
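
The test described above could look roughly like the following. This is a
sketch in Python rather than C, so a private anonymous mmap stands in for
malloc(); the function name fork_probe and its parameters are made up for
the example.

```python
import mmap
import os
import time


def fork_probe(size_mb=32, pause=10.0):
    """Allocate and touch size_mb of private anonymous memory, pause,
    fork, and pause again in both processes. Returns the child's exit
    status as seen by the parent."""
    size = size_mb * 1024 * 1024
    # Private anonymous mapping, standing in for a large malloc().
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANON)
    # Touch every page so the whole region is dirty before the fork.
    for off in range(0, size, mmap.PAGESIZE):
        buf[off] = 1
    time.sleep(pause)           # window to run "pstat -s" or "swap -s"

    pid = os.fork()
    if pid == 0:
        # Child: hold the copy-on-write pages without touching them.
        time.sleep(pause)
        os._exit(0)
    # Parent: also leaves the memory alone while the child is alive.
    time.sleep(pause)
    _, status = os.waitpid(pid, 0)
    buf.close()
    return os.WEXITSTATUS(status)
```

Call fork_probe() from one terminal and check the swap summary from another
before the call, during the first pause, and during the second pause.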

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Andrzej Bialecki

On Wed, 14 Jul 1999, John Nemeth wrote:

 On Jul 15,  2:40am, "Daniel C. Sobral" wrote:
 } Garance A Drosihn wrote:
 }  At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
 }   In which case the program that consumed all memory will be killed.
 }   The program killed is +NOT+ the one demanding memory, it's the one
 }   with most of it.
 }  
 }  But that isn't always the best process to have killed off...
 } 
 } Sure it is. :-) Let's see...
 
  This statement is absurd.  Only a competent admin can decide
 which process can be killed.  No arbitrary decision is going to be
 correct.
 
 }  It would be nice to have a way to indicate that, a la SIGDANGER.

How about assigning something like a class to each process, which gives the
VM a hint about which processes should be killed first without much thinking,
and which last (or never)? In other words, let's say class 10 means
"totally disposable, kill whenever you want", and class 1 means "never try
to kill me". Of course, most processes would get some default value, and
the superuser could "renice" them to a more resistant class.

This way both sides of the discussion would be satisfied :-)
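
A victim-selection policy along these lines might be sketched as follows.
Everything here is hypothetical: the Proc record, the pick_victim function,
and the tie-break on resident size are assumptions for illustration, not an
existing kernel interface.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    kill_class: int     # 10 = totally disposable ... 1 = never kill
    resident_kb: int


def pick_victim(procs, never_kill=1):
    """Pick the process to kill when swap runs out: the most disposable
    class first, and within a class the largest memory user."""
    candidates = [p for p in procs if p.kill_class > never_kill]
    if not candidates:
        return None     # only protected processes are left
    return max(candidates, key=lambda p: (p.kill_class, p.resident_kb))
```

With this ordering a large class-1 process is never chosen, even if it is
the one holding most of the memory.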

Andrzej Bialecki

//  [EMAIL PROTECTED] WebGiro AB, Sweden (http://www.webgiro.com)
// ---
// -- FreeBSD: The Power to Serve. http://www.freebsd.org 
// --- Small & Embedded FreeBSD: http://www.freebsd.org/~picobsd/ 






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


:Before program start:
:total: 20000k bytes allocated + 4792k reserved = 24792k used, 191048k available
:
:After malloc, before touch:
:total: 18756k bytes allocated + 37500k reserved = 56256k used, 159580k available
:
:After malloc + touch:
:total: 52804k bytes allocated + 4852k reserved = 57656k used, 158184k available
:
:After fork:
:total: 52928k bytes allocated + 37644k reserved = 90572k used, 125264k available
:
:[there has been a little background activity, but the numbers speak for themselves]
:
:
:Daniel

Assuming the allocated field is not inclusive of real
memory, what we have is swap reservation under solaris
for clean pages, and allocation and assignment for dirty
pages.  The grand total will tell you the total VM potential
for malloc'd space but does not appear to tell you how 
much swap is actually active - i.e. was written to and 
contains valid data.

It would be interesting to see if the stack segment is
included in the reservation.  Try setting the stack resource
limit to 32m and run the same program, except without
bothering to malloc() or touch anything.  See if the
stack segment is included in the reservation field.

It would also be interesting to see how solaris deals
with MAP_PRIVATE mmap's.

If this is correct, then solaris is using a VMSPACE = SWAPSPACE
model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread sthaug

 If this is correct, then solaris is using a VMSPACE = SWAPSPACE
 model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

AFAIK it has been stated quite explicitly by the Solaris folks that
Solaris 2.x uses VMSPACE = SWAPSPACE + REALMEM. This is *different*
from SunOS 4.1.x.

Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

Here is what I get from one of BEST's mail & www proxy machines.
~dillon/br adds the object sizes together.  'swap' and 'default'
objects refer to unbacked VM objects - and none of the processes running
fork shared unbacked objects so we don't have to worry about that.  The
'swap' designation means that at least one page in the object has been
assigned swap.  The 'default' designation means that no pages have been
assigned swap.  The pages can be dirty or clean.

Typical /proc/PID/map output looks like this (taken from one of the
sendmail processes).  The lines I've marked are the ones being counted
as unbacked/swap-backed VM.  The rest are vnode-backed and not counted.

0x1000     0x4b000      66    0 r-x COW vnode
0x4b000    0x4e000       3    3 rwx COW vnode
0x4e000    0x87000      53   43 rwx COW swap     <---
0x87000    0x373000    738  738 rwx default      <---
0x2004b000 0x2005a000    2    0 r-x COW vnode
0x2005a000 0x2005c000    2    0 rwx COW vnode
0x2005c000 0x20065000    6    2 rwx COW swap     <---
0x20068000 0x2006d000    3    0 r-x COW vnode
0x2006d000 0x2006e000    1    1 rwx COW vnode
0x2006e000 0x200cc000   70    0 r-x COW vnode
0x200cc000 0x200d0000    4    4 rwx COW vnode
0x200d0000 0x200e7000    8    6 rwx COW swap     <---
0xefbde000 0xefbfe000   14   14 rwx COW swap     <---

proxy1:/tmp# cat /proc/*/map | egrep 'swap|default' | ~dillon/br
639168K
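
The source of ~dillon/br is not shown in the thread. A hypothetical stand-in,
assuming each map line starts with the start and end addresses in hex and
that "object size" means end minus start, might look like this in Python:

```python
def swap_backable_kb(map_lines):
    """Sum the sizes (in KB) of all 'swap' and 'default' mappings.

    Each line is expected to look like the /proc/PID/map output above:
    start and end address in hex, then counters, protection and type.
    """
    total_bytes = 0
    for line in map_lines:
        fields = line.split()
        if len(fields) < 2:
            continue                    # skip blank or malformed lines
        if "swap" not in fields and "default" not in fields:
            continue                    # vnode-backed: not counted
        start = int(fields[0], 16)
        end = int(fields[1], 16)
        total_bytes += end - start
    return total_bytes // 1024
```

Fed every process's map lines, a helper like this would produce a single
figure comparable to the one printed above.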

proxy1:/tmp# pstat -s
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0b      524288    12596   511628     2%    Interleaved

This machine has 256MB of RAM, of which around 200MB is in use; we
will assume the entire 200MB is used by VM spaces for processes.  It is
an active machine with around 205 processes at the time of the test.

So.  200MB of RAM + 12MB of swap = 212MB of actual storage being used
out of 639MB of total swap-backable VM.

About a factor of 3.2:1.  Actual swap utilization is sitting at 2%.
If no overcommit were allowed, and assuming a VMSPACE = REALMEM + SWAP
model, 200MB of RAM would be active and 439MB worth of swap would be
either allocated or reserved (though only 12MB would actually be written;
that part doesn't change).  439MB of swap versus 12MB of swap.

In that scenario, the 512MB of swap I assigned to this machine would be
dangerously low.
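
The arithmetic above can be replayed as a short sketch.  The function
name reserve_needed_mb is hypothetical, and reading the 3.2:1 figure as
swap-backable VM divided by RAM in use is an assumption (639/200); the
input numbers are the ones quoted in this message.

```python
def reserve_needed_mb(swap_backable_vm_mb, ram_in_use_mb):
    """Swap that must be allocated or reserved if overcommit is disallowed,
    under a VMSPACE = REALMEM + SWAP model."""
    return max(swap_backable_vm_mb - ram_in_use_mb, 0)


total_vm = 639         # MB of swap-backable VM (639168K from the map totals)
ram_in_use = 200       # MB of RAM assumed to hold process pages
swap_written = 12      # MB of swap actually used (12596 1K-blocks)
swap_configured = 512  # MB of swap assigned to the machine

needed = reserve_needed_mb(total_vm, ram_in_use)
print("storage actually used: %d MB" % (ram_in_use + swap_written))
print("VM to RAM in use: %.1f:1" % (total_vm / ram_in_use))
print("swap to reserve without overcommit: %d MB" % needed)
print("headroom left in %d MB: %d MB" % (swap_configured, swap_configured - needed))
```

With these inputs the reservation comes to 439MB, which leaves only
73MB of the configured 512MB uncommitted.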

-Matt
Matthew Dillon 
[EMAIL PROTECTED]






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread lyndon

 In that scenario, the 512MB of swap I assigned to this machine would be
 dangerously low.

With 13GB disks available for a couple of hundred bucks, my machines aren't
going to run out of swap space any time soon, even if I commit to disk.

All I want for Christmas is a knob to disable overcommit.

--lyndon





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Sheldon Hearn



On Thu, 15 Jul 1999 17:53:52 CST, [EMAIL PROTECTED] wrote:

 All I want for Christmas is a knob to disable overcommit.

And what I'm pretty sure the majority of the readers on this list want
is for those of you who really think it's necessary to do it yourselves.

What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
wonder whether that's significant...

Ciao,
Sheldon.






re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread matthew green

   
All I want for Christmas is a knob to disable overcommit.
   
   And what I'm pretty sure the majority of the readers on this list want
   is for those of you who really think it's necessary to do it yourselves.
   
   What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
   wonder whether that's significant...


that's an impressively bold statement to make.  by my reckoning, at
least 4 people who have posted "wanting no overcommit" are more than
capable of programming this for NetBSD.


.mrg.





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


: In that scenario, the 512MB of swap I assigned to this machine would be
: dangerously low.
:
:With 13GB disks available for a couple of hundred bucks, my machines aren't
:going to run out of swap space any time soon, even if I commit to disk.
:
:All I want for Christmas is a knob to disable overcommit.
:
:--lyndon

If your machines aren't going to run out of swap, then the overcommit 
isn't going to hurt you in a million years.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


:Technical follow-up:
:
:Contrary to what I previously said, a number of tests reveal that
:Solaris, indeed, does not overcommit. All non-read only segments,
:and all malloc()ed memory is reserved upon exec() or fork(), and the
:reserved memory is not allowed to exceed the total memory. It makes
:extensive use of read only DATA segments, and has a NON_RESERVE
:mmap() flag.
:
:Though the foot firmly planted in my mouth ought to prevent me from
:saying anything else, I must say that it does explain a few things
:to me...
:
:--
:Daniel C. Sobral   (8-DCS)
:[EMAIL PROTECTED]

Something is weird here.  If the Solaris people are using a
SWAPSIZE + REALMEM VM model, they have to allow the
allocated + reserved space to go +REALMEM bytes over available swap
space.  If not, they are using only a SWAPSIZE VM model.

Wait - does Solaris normally use swap files or swap partitions?
Or is it that weird /tmp filesystem stuff?  If it normally uses swap 
files and allows holes then that explains everything.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Michael Schuster - TSC SunOS Germany
Hi everyone,

I've been following this discussion almost from the beginning, and I
have the feeling that we're not _really_ getting very far. There are good
arguments for and against overcommit, depending on your point of view
and your requirements.

What I do see is a not-so-openly voiced consensus that the way
resource shortages are handled in an overcommitting system
(SIGKILL) makes some of us rather unhappy. I therefore suggest that those of
us who would like to see a change in this area pool their efforts and
energies to work on a mechanism that handles resource shortage in a more
graceful way.

cheerio
Michael
-- 
michael.schus...@germany.sun.com





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Garance A Drosihn
At 6:29 PM -0700 7/14/99, Matthew Dillon wrote:
If 1G isn't enough, spend another $30 and throw 2G of swap
online.  Or perhaps dedicate an entire $150 disk and throw
6+ GB of swap online.

The equivalent setup using a non-overcommit model would require
considerably more swap to have the same reliability.

Please note that we're talking at cross-purposes here, mainly
because I didn't realize this same general topic was being
beaten to death in the 'replacement for grep' thread (which I
have not been following).

Speaking for just me myself and I, I have no problems with the
current overcommit model.  All I'd like to do is have a way to
indicate which processes should not get booted first, if the
system does indeed run out of swap and needs to boot some
processes.  However, other people seem much more worked up
about this topic than I am, and thus what I (personally) meant
as just casual questions seem to be taken as demands that
something be done, RIGHT NOW.

I now realize that some people are arguing that malloc should
return an error if the system runs out of space, but that's not
what I am thinking about.

So, I think I'll bow out of this discussion for now, and maybe
try to discuss my casual questions sometime in a different
context...

---
Garance Alistair Drosehn   =   g...@eclipse.acs.rpi.edu
Senior Systems Programmer  or  dro...@rpi.edu
Rensselaer Polytechnic Institute





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Noriyuki Soda
 On Thu, 15 Jul 1999, Daniel C. Sobral wrote:
 Uh... like any modern unix, Solaris overcommits.

 On Thu, 15 Jul 1999 08:46:36 -0700 (PDT),
Eduardo E. Horvath e...@one-o.com said:

 Where do you guys get this misinformation?  
:
 Note the `19464k reserved'; that space has been reserved but not yet
 allocated.

Both Dillon and Sobral mistakenly claimed that Solaris overcommits;
this fact seems to be somewhat suggestive.

Also, the following are the allocated and reserved memory figures
in my environment. (This table also includes Eduardo's example.)

SunOS   allocated   reserved      total   total/allocated
-----   ---------   --------   --------   ---------------
4.1.4       4268k      1248k      5516k   1.2924
4.1.2       7732k      1492k      9224k   1.193
4.1.4       8848k      3080k     11928k   1.3481
4.1.4      13532k      6772k     20304k   1.5004
5.5.1      15312k      5092k     20404k   1.3325
4.1.3      16112k      6512k     22624k   1.4042
4.1.2      26356k      1620k     27976k   1.0615
4.1.4      26560k      3756k     30316k   1.1414
5.5        26076k     11348k     37424k   1.4352
4.1.4      32984k      5556k     38540k   1.1684
5.6        32448k      7072k     39520k   1.2179
4.1.4      38056k      3692k     41748k   1.097
4.1.4      49064k      7672k     56736k   1.1564
4.1.4      67012k      7800k     74812k   1.1164
4.1.4      99348k     16956k    116304k   1.1707
4.1.4     118288k     11780k    130068k   1.0996
5.6       231968k     18880k    250848k   1.0814
5.7       307240k     19464k    326704k   1.0634

(sorted by total amount of used swap)

In those examples, a non-overcommitting system requires 1.06x to 1.50x
as much swap space as an overcommitting system.  This table also indicates
that as the total used swap increases, the ratio decreases.
The extra swap space required on a non-overcommitting system is
approximately several tens of megabytes, i.e. the extra cost of a
non-overcommitting system is less than ten dollars in my environment.
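
As a rough check of how the last column is derived, here is a small
Python sketch that recomputes total/allocated for a few rows of the
table.  The function name overcommit_ratio and the choice of rows are
illustrative assumptions; the numbers themselves come from the table.

```python
# (SunOS version, allocated KB, reserved KB) for a few rows of the table.
ROWS = [
    ("4.1.4", 4268, 1248),
    ("4.1.4", 13532, 6772),
    ("4.1.2", 26356, 1620),
    ("5.7", 307240, 19464),
]


def overcommit_ratio(allocated_kb, reserved_kb):
    """Return total/allocated, where total = allocated + reserved."""
    return (allocated_kb + reserved_kb) / allocated_kb


for version, allocated, reserved in ROWS:
    ratio = overcommit_ratio(allocated, reserved)
    # The "extra" swap a reserving system needs is just the reserved column.
    print("%-6s extra %6dk  ratio %.4f" % (version, reserved, ratio))
```

The reserved column is the extra swap being argued about, so the ratio
is a statement about how large that reservation is relative to what has
actually been allocated.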

Matt Dillon claimed that a non-overcommitting system requires 8x or more
swap space than an overcommitting system. The figures above show that this
is wrong. (There might be cases which require 8x swap, but that is not
typical, contrary to what Dillon said.)

If you don't want a non-overcommitting system because you don't want to
pay its cost, that's OK, but please don't force us to accept your
limited view.
--
soda





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon
:Both Dillon and Sobral mistakenly claimed that Solaris overcommits,
:this fact seems to be somewhat suggestive.
:
:And also, the followings are allocated memory and reserved memory 
:in my environment. (This table also includes Eduardo's example)
:
:   SunOS   allocated   reserved      total   total/allocated
:   -----   ---------   --------   --------   ---------------
:   4.1.4       4268k      1248k      5516k   1.2924
:   4.1.2       7732k      1492k      9224k   1.193
:   4.1.4       8848k      3080k     11928k   1.3481
:   4.1.4      13532k      6772k     20304k   1.5004
:   5.5.1      15312k      5092k     20404k   1.3325
:   4.1.3      16112k      6512k     22624k   1.4042
:   4.1.2      26356k      1620k     27976k   1.0615
:   4.1.4      26560k      3756k     30316k   1.1414
:   5.5        26076k     11348k     37424k   1.4352
:   4.1.4      32984k      5556k     38540k   1.1684
:   5.6        32448k      7072k     39520k   1.2179
:   4.1.4      38056k      3692k     41748k   1.097
:   4.1.4      49064k      7672k     56736k   1.1564
:   4.1.4      67012k      7800k     74812k   1.1164
:   4.1.4      99348k     16956k    116304k   1.1707
:   4.1.4     118288k     11780k    130068k   1.0996
:   5.6       231968k     18880k    250848k   1.0814
:   5.7       307240k     19464k    326704k   1.0634
:
:   (sorted by total amount of used swap)
:
:In those examples, non-overcommiting system requires 1.06x ... 1.50x
:...
:soda

Umm... how are you getting the reserved numbers?  Are you
sure that isn't simply cached swap blocks?  I.e., when something
gets swapped out and then is swapped back in and dirtied,
Solaris may be holding the swap block assignment rather
than letting it go.  FreeBSD-stable does the same thing.
FreeBSD-current does not -- it lets it go in order to be
able to reallocate it later as part of a contiguous swath
for performance reasons.

These 'extra' swap blocks are effectively reserved but not
actually allocated.  They can be reassigned.  The numbers
above are very similar to what you would see in a
redirtying-cache swap block situation on a FreeBSD-stable
system.

If I add up all the unshared writeable segments on my
home box - that is, all segments for which one would 
potentially have to reserve swap space - I get a total
of around 382MB.  The machine is currently eating around
100MB of RAM and 5MB of swap, or around a 3.5:1 ratio
in this case.  A non-overcommit model would have to
reserve swap space for 382MB - 100MB = 282MB versus the
5MB of swap the machine actually allocates.

-Matt
Matthew Dillon 
dil...@backplane.com






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Noriyuki Soda
 On Thu, 15 Jul 1999 11:09:01 -0700 (PDT),
Matthew Dillon dil...@apollo.backplane.com said:

 Umm... how are you getting the reserved numbers? 

pstat -s on SunOS4, and swap -s on SunOS5. From Solaris man page:

:-s  Print summary information about total swap
:    space usage and availability:
:
:    allocated   The total amount of swap space
:                (in 1024-byte blocks)
:                currently allocated for use as
:                backing store.
:
:    reserved    The total amount of swap space
:                (in 1024-bytes blocks) not
:                currently allocated, but
:                claimed by memory mappings for
:                possible future use.
:
:    used        The total amount of swap space
:                (in 1024-byte blocks) that is
:                either allocated or reserved.
--
soda





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

:pstat -s on SunOS4, and swap -s on SunOS5. From Solaris man page:
:
::-s  Print summary information about total swap
::    space usage and availability:
::
::    allocated   The total amount of swap space
::                (in 1024-byte blocks)
::                currently allocated for use as
::                backing store.
::
::    reserved    The total amount of swap space
::                (in 1024-bytes blocks) not
::                currently allocated, but
::                claimed by memory mappings for
::                possible future use.
::
::    used        The total amount of swap space
::                (in 1024-byte blocks) that is
::                either allocated or reserved.
:--
:soda

Yah, that's what I thought.  A solaris expert could tell us
for sure but I am pretty sure those are simply cached swap
blocks after-the-fact, not actual reservations on potentially
swappable space.

-Matt
Matthew Dillon 
dil...@backplane.com





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon
::-s  Print summary information about total swap
::    space usage and availability:
::
::    allocated   The total amount of swap space
::                (in 1024-byte blocks)
::                currently allocated for use as
::                backing store.
::
::    reserved    The total amount of swap space
::                (in 1024-bytes blocks) not
::                currently allocated, but
::                claimed by memory mappings for
::                possible future use.
::
::    used        The total amount of swap space
::                (in 1024-byte blocks) that is
::                either allocated or reserved.
:--
:soda

It would be really easy to test this.

Write a program that mallocs 32MB of space and touches it,
then sleeps 10 seconds and forks, with both child and parent
sleeping afterwards.  (The parent and the forked child should
not touch the memory after the fork occurs.)

Do a pstat -s before, after the initial touch, and after
the fork.  If you do not see the reserved swap space jump
by 32MB after the fork, it isn't what you thought it was.
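
A rough sketch of such a test follows, written in Python rather than C,
so it is only an approximation: a private anonymous mmap stands in for
the malloc()ed region, and the interpreter carries memory of its own.
The names touch and run_test, and the extra pause between allocating and
touching, are assumptions made for illustration.

```python
import mmap
import os
import time


def touch(buf, page_size=mmap.PAGESIZE):
    """Write one byte into every page so each page becomes dirty."""
    for offset in range(0, len(buf), page_size):
        buf[offset] = 1
    return (len(buf) + page_size - 1) // page_size  # pages touched


def run_test(size=32 * 1024 * 1024, pause=10):
    """Allocate, touch, pause, fork, pause.  Returns the child's wait status."""
    # Private anonymous memory stands in for the malloc()ed region.
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    print("allocated %d bytes; check swap usage now" % size, flush=True)
    time.sleep(pause)
    touch(buf)
    print("touched; check swap usage again", flush=True)
    time.sleep(pause)
    pid = os.fork()
    if pid == 0:
        # Child: leave the memory alone and just hold the address space.
        time.sleep(pause)
        os._exit(0)
    print("forked child %d; check swap usage again" % pid, flush=True)
    time.sleep(pause)  # the parent does not touch the memory either
    _, status = os.waitpid(pid, 0)
    buf.close()
    return status
```

Calling run_test() with the defaults and running pstat -s (or swap -s on
Solaris) in another terminal at each prompt gives the before, after-touch
and after-fork readings described above.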

-Matt
Matthew Dillon 
dil...@backplane.com





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Andrzej Bialecki
On Wed, 14 Jul 1999, John Nemeth wrote:

 On Jul 15,  2:40am, Daniel C. Sobral wrote:
 } Garance A Drosihn wrote:
 }  At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
 }   In which case the program that consumed all memory will be killed.
 }   The program killed is +NOT+ the one demanding memory, it's the one
 }   with most of it.
 }  
 }  But that isn't always the best process to have killed off...
 } 
 } Sure it is. :-) Let's see...
 
   This statement is absurd.  Only a competent admin can decide
 which process can be killed.  No arbitrary decision is going to be
 correct.
 
 }  It would be nice to have a way to indicate that, a la SIGDANGER.

How about assigning something like a class to each process, which gives
the VM system a hint about which processes should be killed first without
much thinking, and which last (or never)? In other words, let's say class 10
means totally disposable, kill whenever you want, and class 1 means never try
to kill me. Of course, most processes would get some default value, and the
superuser could renice them to a more resistant class.

This way both sides of the discussion would be satisfied :-)
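
To make the idea concrete, here is a minimal user-space sketch in Python
of how a victim might be chosen under such a scheme.  Nothing here
reflects existing kernel code: the names Proc and pick_victim, the
default class of 5, and the tie-break on memory size are all assumptions
made for illustration.

```python
from dataclasses import dataclass

DEFAULT_CLASS = 5  # assumed default; the proposal does not name a value
NEVER_KILL = 1     # "class 1 means never try to kill me"


@dataclass
class Proc:
    pid: int
    vm_kb: int                      # memory footprint, used only as a tie-break
    kill_class: int = DEFAULT_CLASS


def pick_victim(procs):
    """Return the process to kill first, or None if all are protected.

    The highest class number goes first (10 = totally disposable).  Among
    processes of the same class, the largest one is chosen; that tie-break
    is an assumption, not part of the proposal.
    """
    candidates = [p for p in procs if p.kill_class > NEVER_KILL]
    if not candidates:
        return None
    return max(candidates, key=lambda p: (p.kill_class, p.vm_kb))


procs = [
    Proc(pid=1, vm_kb=400, kill_class=1),        # protected
    Proc(pid=120, vm_kb=90000),                  # default class
    Proc(pid=300, vm_kb=32000, kill_class=10),   # disposable
]
print(pick_victim(procs).pid)
```

Here the disposable process is selected even though the default-class
process is larger, which is the behaviour the class hint is meant to buy.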

Andrzej Bialecki

//  ab...@webgiro.com WebGiro AB, Sweden (http://www.webgiro.com)
// ---
// -- FreeBSD: The Power to Serve. http://www.freebsd.org 
// --- Small  Embedded FreeBSD: http://www.freebsd.org/~picobsd/ 






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

:Before program start:
:total: 20000k bytes allocated + 4792k reserved = 24792k used, 191048k available
:
:After malloc, before touch:
:total: 18756k bytes allocated + 37500k reserved = 56256k used, 159580k available
:
:After malloc + touch:
:total: 52804k bytes allocated + 4852k reserved = 57656k used, 158184k available
:
:After fork:
:total: 52928k bytes allocated + 37644k reserved = 90572k used, 125264k available
:
:[there has been a little background activity, but the numbers speak for themselves]
:
:
:Daniel

Assuming the allocated field is not inclusive of real
memory, what we have is swap reservation under solaris
for clean pages, and allocation and assignment for dirty
pages.  The grand total will tell you the total VM potential
for malloc'd space but does not appear to tell you how 
much swap is actually active - i.e. was written to and 
contains valid data.

It would be interesting to see if the stack segment is
included in the reservation.  Try setting the stack resource
limit to 32m and run the same program, except without
bothering to malloc() or touch anything.  See if the
stack segment is included in the reservation field.

It would also be interesting to see how solaris deals
with MAP_PRIVATE mmap's.

If this is correct, then solaris is using a VMSPACE = SWAPSPACE
model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

-Matt
Matthew Dillon 
dil...@backplane.com






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Jonathan Lemon
In article local.mail.freebsd-hackers/199907151825.laa11...@apollo.backplane.com you write:
::-s  Print summary information about total swap
::    space usage and availability:
::
::    allocated   The total amount of swap space
::                (in 1024-byte blocks)
::                currently allocated for use as
::                backing store.
::
::    reserved    The total amount of swap space
::                (in 1024-bytes blocks) not
::                currently allocated, but
::                claimed by memory mappings for
::                possible future use.
::
::    used        The total amount of swap space
::                (in 1024-byte blocks) that is
::                either allocated or reserved.
:--
:soda

It would be really easy to test this.

Write a program that malloc's 32MB of space and touches it,
then sleeps 10 seconds and forks, with both child and parent
sleeping afterwords.  ( the parent and the forked child should
not touch the memory after the fork occurs ).

Do a pstat -s before, after the initial touch, and after
the fork.  If you do not see the reserved swap space jump
by 32MB after the fork, it isn't what you thought it was.

aladdin[5:32pm] prtconf
System Configuration:  Sun Microsystems  i86pc
Memory size: 128 Megabytes

aladdin[5:41pm] uname -a
SunOS aladdin 5.6 Generic_105182-14 i86pc i386


total: 67280k bytes allocated + 28668k reserved = 95948k used, 196460k avail
malloced 32MB...
total: 67320k bytes allocated + 61460k reserved = 128780k used, 163592k avail
touched...
total: 100084k bytes allocated + 28696k reserved = 128780k used, 163732k avail
forking...
total: 100092k bytes allocated + 61520k reserved = 161612k used, 130864k avail
touching again (parent)...
touching again (child)...
total: 132864k bytes allocated + 28748k reserved = 161612k used, 130760k avail
exiting...
exiting...
total: 67248k bytes allocated + 28700k reserved = 95948k used, 196448k avail
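
The step-to-step changes in the output above can be read off with a
small Python sketch that parses the "total:" lines and prints the
deltas.  The parse_swap_s helper and its regular expression are
assumptions written for this listing, not a general parser for swap -s.

```python
import re

LINE = re.compile(
    r"total:\s*(\d+)k bytes allocated \+ (\d+)k reserved = (\d+)k used, (\d+)k avail"
)


def parse_swap_s(line):
    """Return (allocated, reserved, used, available) in KB from a total line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError("not a swap -s total line: %r" % line)
    return tuple(int(g) for g in m.groups())


before = parse_swap_s(
    "total: 67280k bytes allocated + 28668k reserved = 95948k used, 196460k avail")
malloced = parse_swap_s(
    "total: 67320k bytes allocated + 61460k reserved = 128780k used, 163592k avail")
touched = parse_swap_s(
    "total: 100084k bytes allocated + 28696k reserved = 128780k used, 163732k avail")
forked = parse_swap_s(
    "total: 100092k bytes allocated + 61520k reserved = 161612k used, 130864k avail")

print("reserved after malloc: +%dk" % (malloced[1] - before[1]))
print("allocated after touch: +%dk" % (touched[0] - malloced[0]))
print("reserved after fork:   +%dk" % (forked[1] - touched[1]))
```

Each delta is within a few tens of kilobytes of 32768k, that is, the
32MB the test program allocates.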

--
Jonathan





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread lyndon
 And what I'm pretty sure the majority of the readers on this list want
 is for those of you who really think it's necessary to do it yourselves.
 
 What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
 wonder whether that's significant...

Sheldon, if you can't contribute something useful, then shut up.

If I have to do it myself, I will.






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Daniel C. Sobral
Technical follow-up:

Contrary to what I previously said, a number of tests reveal that
Solaris, indeed, does not overcommit. All non-read-only segments
and all malloc()ed memory are reserved upon exec() or fork(), and the
reserved memory is not allowed to exceed the total memory. It makes
extensive use of read-only DATA segments, and has a NON_RESERVE
mmap() flag.

Though the foot firmly planted in my mouth ought to prevent me from
saying anything else, I must say that it does explain a few things
to me...

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

Would you like to go out with me?
I'd love to.
Oh, well, n... err... would you?... ahh... huh... what do I do
next?






Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Doug Rabson

On Tue, 13 Jul 1999, Jon Ribbens wrote:

 Alfred Perlstein [EMAIL PROTECTED] wrote:
  You're browsing with netscape and It hits about 32megs in size,
  you click on a multimedia object and netscape execs a helper app.
 
 vfork()
 
  you also have to consider a program wishing to make sparse use
  of its address space, without overcommit it becomes impossible.
 
 So Don't Do That Then.

Overcommit can be used for many reasons. I use it to reserve a large
linear address space into which to mmap alpha i/o spaces, which allows an
efficient implementation of inx/outx in user mode:

  UID   PID  PPID CPU PRI NI      VSZ   RSS WCHAN  STAT  TT        TIME COMMAND
    0 43655 43652   7   2  0 12616584 12456 select S     ??  1036:41.62 /usr/X11R6/bin/X -auth /usr/X11R6/lib/X11/xdm/authdir/A:0-w43652

The X server is using 12G of address space.

--
Doug Rabson Mail:  [EMAIL PROTECTED]
Nonlinear Systems Ltd.  Phone: +44 181 442 9037







Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Matthew Dillon


:   Back on topic:
:
:   Obviously you devote the most time to handling the most common
:   and serious failure modes, but if someone else is willing to
:   put in the work to handle nightmare cases, should you ignore or
:   discard that work?

Of course not.  But nobody in this thread is even close to doing any
actual work and so far the two people I know who can (me and DG) aren't
particularly interested.  Instead they seem to want someone else to do
the work based on what I consider to be entirely unsubstantiated
supposition.  Would you accept someone's unsupported and untested theories
based almost entirely on a nightmare scenario to the exclusion of all
other possible (and more likely) problems?  I mean come on... read some
of this stuff.  There are plenty of ways to solve these problems without
making the declaration that the overcommit model is flawed beyond repair,
and so far nobody has bothered to offer any counter-arguments to the
resource management issues involved with actually *implementing* a
non-overcommit model... every time I throw up hard numbers the only
response I get is a shrug-off with no basis in fact or experience noted
anywhere.  In the real world, you can't shrug off those sorts of problems.

I'm the only one trying to run hard numbers on the problem.  Certainly
nobody else is.  This is hardly something that would actually convince
me of the efficacy of the model as applied to a UNIX kernel core.  Instead,
people are pulling out their favorite screwups and then blaming the
overcommit model for all their troubles rather than looking for the
more obvious answer:  A misconfiguration or simply a lack of resources.
Some don't even appear to *have* any trouble with the overcommit model,
but argue against it anyway, basing their entire argument on the
possibility that something might happen, again without bothering to
calculate the probability or run any hard numbers.

The argument is shifting from embedded work to multi-user operations to
*hostile* multi-user systems with some people advocating that a 
non-overcommit model will magically solve all their woes in these very
different scenarios, but can't be bothered with actually finding a 
real-life scenario or using an experience to demonstrate their position.

It is all pretty much garbage.  No wonder the NetBSD core broke up, if
this is what they had to deal with 24 hours a day!


:   Put more accurately - if someone wants to provide a different rope
:   to permit people to write in a different defensive style, and it
:   does not in any way impact your use of the system: More power to them.
:
:   David/absolute

As I've said on several occasions now, there is nothing in the current
*BSD design that prevents an embedded designer from implementing his or her
own memory management subsystem to support the memory requirements of
their programs.  The current UNIX out-of-memory kill scenario only occurs
as a last resort and it is very easy for an embedded system to avoid.  It
should be considered nothing more than a watchdog for catastrophic 
failure.  To implement the simplest non-overcommit system in the *BSD
kernel - returning NULL on an allocation failure due to non-availability 
of backing store - is virtually useless because it is just as arbitrary
as killing processes.  It might help a handful of people out of hundreds 
of thousands do something but they would do a lot better with a watchdog
script.  It makes no sense to try to build it into the kernel.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]




Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Robert Elz

Date:Tue, 13 Jul 1999 14:14:52 -0700 (PDT)
From:Matthew Dillon [EMAIL PROTECTED]
Message-ID:  [EMAIL PROTECTED]

  | If you don't have the disk necessary for a standard overcommit model to
  | work, you definitely do not have the disk necessary for a non-overcommit 
  | model to work.

This is based upon your somewhat strange definition of "work".   I assure
you that I have run many systems which don't use overcommit, and which
quite frequently run into "out of VM" conditions, and which I can assure
you, work just fine.   When they're getting to run out of VM, the system
is approaching paging death, which is as you'd expect (they're overloaded).
That is, adding more VM (more swap space) would be counterproductive.

When this stage is reached, the absolute prime requirement of "working"
is met though - applications that request memory get that request refused,
but absolutely no processes get ungracefully killed.

In a sense, no-one really cares what the page allocation policy is, the
argument here isn't about overcommit, or the very conservative early BSD
version, or any of the intermediate possibilities - all people really care
about is what happens when resources are exhausted.   What happens until
then no-one really cares about (there are some issues of how much space
you need to dedicate to paging - most people would probably prefer to
not use the early BSD method, where you needed at least as much paging space
as RAM, or some of your RAM simply would be left idle).

But one absolute requirement for any system that wants to consider itself
to be a reliable, usable, general purpose system, is that it never simply
randomly kill processes of its own volition.   If you're happy for random
processes to be killed on your workstation, that's fine, I'm not.   I run
processes which are intended to do specific work, they're not intended to
simply go away just because memory is running low (there are other processes,
stupid perl scripts and such, which will quite quickly die when a mem
request is refused, and return resources, so the processes that matter,
which can be very large, can keep on processing).

I have no doubt that you can dream up scenarios where you pander to
the laziness of programmers and allow the use of huge VM spaces with little
of it actually allocated anywhere (or ever touched), in which case you would
indeed need monstrous amounts of paging space, most of which is never actually
used for anything - personally I prefer to have the programmers think
a little more about the memory footprint of their data structures.  Not
only does this reduce the VM footprint, it will also usually vastly
improve the paging characteristics.   Most applications which simply
scatter data through a huge VM space simply stop being usable as soon
as their RSS exceeds available physical memory - that is, if they start
paging, they die (become comatose might be a better description).
A little intelligent thought as to how to actually make use of the mem
resources can make a huge difference.

There was an earlier comment on this thread (which no longer has the slightest
thing to do with the new version of grep...) which mentioned fortran
programs.   People, fortran (and huge fortran programs) has been around
much longer than VM has been.   There are lots of techniques for fortran
programmers to use to make use of restricted memory sizes, they've been
managing that for decades.

kre





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Niall Smart

 Maybe if I call the sysctl "vm.crashmenow".  No, that will just make more
 people actually try it.  It might be doable as a compile-time option,
 since you wouldn't be able to run anything approaching standard on
 such a system anyway.  I don't see much use for it myself.  As I said
 before, there are easier ways to manage memory that are not quite as
 arbitrary as simply refusing a potential overcommit.

Perhaps it could be an additional flag to mmap, in this way
people wishing to run an overcommitted system could do so
but those writing programs which must not overcommit for
certain memory allocations could ensure they did not do so.

Regards,

Niall



Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Daniel C. Sobral

Noriyuki Soda wrote:
 
 Running out of swap can easily be done with normal user privileges.
 A non-overcommitting system can run an important application on a system
 which has normal users, because it never loses critical data, even if
 a user on the system makes a mistake. (The application might stop,
 but it never loses data.)
 
 A 4.4BSD derived system cannot do this, and has to use a different
 machine for such applications.

Incorrect. We can set *limits* on the users, so they won't be able
to crash the system.
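
To make the limits approach concrete, here is a minimal sketch in Python.
It assumes a platform where RLIMIT_AS exists and is enforced; the helper
names (run_with_memory_limit, report_limit) and the 2 GB figure are
hypothetical, chosen only for illustration, and are not taken from this
thread.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(argv, max_bytes):
    """Run argv as a child whose address space is capped at max_bytes."""

    def apply_limit():
        # Runs in the child between fork() and exec(); only the child is limited.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        limit = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)


def report_limit(max_bytes):
    """Ask a limited child which soft limit it actually inherited."""
    child = "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"
    result = run_with_memory_limit([sys.executable, "-c", child], max_bytes)
    return int(result.stdout)


if __name__ == "__main__":
    print(report_limit(2 * 1024 ** 3))
```

A process started this way gets its oversized allocations refused once it
reaches the cap, while the rest of the system is left alone. Choosing the
cap is the hard part, as the rest of the thread discusses.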

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"



Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Chris G. Demetriou

Doug Rabson [EMAIL PROTECTED] writes:
 Overcommit can be used for many reasons. I use it to reserve a large
 linear address space to mmap alpha i/o spaces [...]

Overcommit can be used for many reasons, but unless you've
misdescribed what you're doing, _that's not one of them_.

The mapped I/O pages need no backing store to be allocated for them by
the VM system.  They're backed by hardware.

And if you have 'placeholder' pages (I note that you didn't say you
mmap all of alpha i/o space, just reserve a large linear address space
in which to mmap it), then it should be possible to map them in such a
way (e.g. read-only ZFOD) in which they wouldn't count against backing
store requirements, either.



cgd
-- 
Chris Demetriou - [EMAIL PROTECTED] - http://www.netbsd.org/People/Pages/cgd.html
Disclaimer: Not speaking for NetBSD, just expressing my own opinion.



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Brian F. Feldman

On Thu, 15 Jul 1999, Daniel C. Sobral wrote:

 "Charles M. Hannum" wrote:
  
  That's also objectively false.  Most such environments I've had
  experience with are, in fact, multi-user systems.  As you've pointed
  out yourself, there is no combination of resource limits and whatnot
  that are guaranteed to prevent `crashing' a multi-user system due to
  overcommit.  My simulation should not be axed because of a bug in
  someone else's program.  (This is also not hypothetical.  There was a
  bug in one version of bash that caused it to consume all the memory it
  could and then fall over.)
 
 In which case the program that consumed all memory will be killed.
 The program killed is +NOT+ the one demanding memory, it's the one
 with most of it.

So why don't we do something else: when we're down to a certain amount of
backing store, start collecting statistics. When we're out, we check the
statistics and find what process has been allocating most of it. We kill
that process.
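
A rough sketch of the bookkeeping this proposal implies, written in Python
purely for illustration (nothing like this exists in the kernel; the class
name, page counts and low-water mark are all assumptions). Statistics are
only collected once free backing store has dropped to the low-water mark,
and the victim is whoever allocated the most after that point.

```python
from collections import Counter


class AllocationTracker:
    """Hypothetical model of 'kill the biggest recent allocator'."""

    def __init__(self, total_pages, low_water_pages):
        self.free_pages = total_pages
        self.low_water_pages = low_water_pages
        self.recent = Counter()  # pid -> pages allocated while store was low

    def allocate(self, pid, pages):
        """Record an allocation.

        Returns None on success, or the pid to kill when the request
        cannot be satisfied from the remaining backing store.
        """
        if pages > self.free_pages:
            if self.recent:
                return self.recent.most_common(1)[0][0]
            return pid  # no statistics yet: fall back to the requester
        self.free_pages -= pages
        if self.free_pages <= self.low_water_pages:
            self.recent[pid] += pages
        return None


if __name__ == "__main__":
    tracker = AllocationTracker(total_pages=1000, low_water_pages=200)
    tracker.allocate(1, 700)         # large, but allocated while store was plentiful
    tracker.allocate(2, 150)         # pushes free store below the low-water mark
    tracker.allocate(3, 100)
    print(tracker.allocate(1, 60))   # store exhausted: prints the chosen victim
```

Note that in this run the process holding the most memory (pid 1) is not
the one chosen; the one that allocated the most while store was low is.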

 
 --
 Daniel C. Sobral  (8-DCS)
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 
   "Would you like to go out with me?"
   "I'd love to."
   "Oh, well, n... err... would you?... ahh... huh... what do I do
 next?"
 

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 [EMAIL PROTECTED]   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 




Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Daniel C. Sobral

"Chris G. Demetriou" wrote:
 
...
 Overcommit avoidance may not be useful for your particular uses of
 these UNIX-like systems.  However, if you think that it's not useful
 to anybody who uses them (or that people who think it's useful are
 deluding themselves 8-), then you're sorely mistaken and have a
 ... very wrong-headed attitude about why people find such features
 useful.

Have you actually tried a system which can work in either overcommit
or non-overcommit mode?

What it comes down to is that if you have enough memory to run in
non-overcommit, you have enough memory to run in overcommit.

Setting limits is complex, but it is no more complex than correctly
sizing the memory in a non-overcommit system (this is demonstrable).

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Daniel C. Sobral

Matthew Dillon wrote:
 
 :
 :Heh, really?  The camera ships w/ Apache running on it.
 :
 :-- Jason R. Thorpe [EMAIL PROTECTED]
 
 They obviously have a lot of memory to play with, then.  Or they
 are crazy.  Writing a web server is fairly easy to do.  I've
 written several, including the one that BEST runs on most of its
 servers.

For the record, professional digital cameras go into the $100K
range, so I'd be expecting it not only to run Apache, but also to
come with Doom. :-)

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Daniel C. Sobral

Jason Thorpe wrote:
 
   There is a lot of hidden 'potential' VM that you haven't considered.
   For example, if the resource limit for a process's stack is 8MB, then
   the process can potentially allocate 8MB of stack even though it may
   actually only allocate 32K of stack.  When a process forks, the child
 
 ...um, so, make the code that deals with faulting in the stack a bit smarter.

Uh? Like what? Like overcommitting, for instance? The beauty of
overcommitting is that either you do it or you don't. :-)
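
To put numbers on the stack example quoted above, here is a small worked
calculation in Python. The 8MB limit and the 32K actually used come from
the quoted text; the count of 100 processes is an assumption picked only
to make the arithmetic visible.

```python
MB = 1024 * 1024
KB = 1024


def stack_commitment(nprocs, stack_limit=8 * MB, stack_used=32 * KB):
    """Return (bytes reserved without overcommit, bytes actually touched)."""
    reserved = nprocs * stack_limit
    touched = nprocs * stack_used
    return reserved, touched


if __name__ == "__main__":
    reserved, touched = stack_commitment(100)
    print(reserved // MB, "MB reserved;", touched // KB, "KB touched")
```

With these assumed figures a strict reservation scheme sets aside 800MB
of backing store for stacks, of which a little over 3MB is ever written.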

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Garance A Drosihn

At 12:00 PM -0400 7/14/99, Brian F. Feldman wrote:
 So why don't we do something else: when we're down to a certain
 amount of backing store, start collecting statistics. When we're
 out, we check the statistics and find what process has been
 allocating most of it. We kill that process.

Not that I'm really commenting on the above idea (although it does
sound fine to me), this reminds me about an earlier thread.  Is there
any interest in us (BSD's) having a SIGDANGER signal like some other
OS's do?  That way, key processes (like sshd) could at least make it
less likely that THEY are the process which is killed.
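
For readers unfamiliar with the idea, a minimal sketch of what a process
might do with such a signal, in Python. SIGUSR1 is used here as a stand-in
because the BSDs have no SIGDANGER; the cache class and its contents are
hypothetical.

```python
import signal

# Stand-in for a low-memory warning signal; an assumption for illustration.
LOW_MEMORY_SIGNAL = signal.SIGUSR1


class ShedableCache:
    """A cache the process can afford to throw away when memory is tight."""

    def __init__(self):
        self.entries = {}
        self.warnings_seen = 0

    def on_low_memory(self, signum, frame):
        # Give memory back voluntarily instead of waiting to be killed.
        self.entries.clear()
        self.warnings_seen += 1


cache = ShedableCache()
cache.entries["message-1"] = b"x" * 4096
signal.signal(LOW_MEMORY_SIGNAL, cache.on_low_memory)
```

How the kernel would then treat processes that do or do not install such
a handler is a separate policy question that this sketch does not address.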

---
Garance Alistair Drosehn   =   [EMAIL PROTECTED]
Senior Systems Programmer  or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral

"Brian F. Feldman" wrote:
 
  In which case the program that consumed all memory will be killed.
  The program killed is +NOT+ the one demanding memory, it's the one
  with most of it.
 
 So why don't we do something else: when we're down to a certain amount of
 backing store, start collecting statistics. When we're out, we check the
 statistics and find what process has been allocating most of it. We kill
 that process.

Because it's not only equally arbitrary but also takes more
resources to implement?

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Julian Elischer

If you wanted to fix this, you could add a patch to malloc that touched
every page that it handed to the application. (and trapped sig11s)
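
As a user-space illustration of the "touch every page" idea, a short
Python sketch. It is not a malloc patch and does not attempt the signal
trapping mentioned above; the function name is made up for the example.

```python
import mmap


def allocate_and_touch(nbytes):
    """Map anonymous memory and write one byte into every page.

    The write forces the system to find real backing for each page now,
    rather than at some later and less convenient moment.
    Returns the mapping and the number of pages touched.
    """
    buf = mmap.mmap(-1, nbytes)  # anonymous, zero-filled mapping
    pages = 0
    for offset in range(0, nbytes, mmap.PAGESIZE):
        buf[offset] = 0  # a write fault occurs even when storing zero
        pages += 1
    return buf, pages


if __name__ == "__main__":
    buf, pages = allocate_and_touch(10 * mmap.PAGESIZE)
    print(pages, "pages touched")
    buf.close()
```

The cost is that every allocation becomes as expensive as actually using
the memory, which is exactly the saving that lazy allocation provides.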


On Wed, 14 Jul 1999 [EMAIL PROTECTED] wrote:

 
  I mean, jeeze, the reservation for the program stack alone would eat
  up all your available swap space!  What is a reasonable stack size?  The
  system defaults to 8MB.  Do we rewrite every program to specify its own
  stack size?  How do we account for architectural differences?  
 
 The alternative is to rewrite every program that assumes the semantics
 of malloc() are being followed. The problem I have as an applications
 writer is that I tend to believe malloc. To pick a specific example,
 our IMAP client takes steps to ensure it won't run out of memory in
 critical sections. We maintain a "rainy day" pool block of memory. If
 we receive a NULL from malloc, we 1) free up whatever memory we can
 in other parts of the client (possibly using the rainy day pool to
 stage data out to disk), and 2) if necessary, reduce the size of the
 rainy day pool. This whole design is predicated on malloc() telling
 the truth. If instead it gives us a bogus block of memory, then
 seg faults when we try to use it, the best we can do is try to shut
 down without losing any of the users mail (and in fact we don't
 even do that, since there are just too many places where this can
 happen in third-party libraries that we aren't willing to rewrite).
 Sending us a kill signal is even worse. (And extremely unfair, since
 we take pains to not waste memory in the first place.)
 
 Has anyone analyzed all those applications people talk about that
 show huge allocation footprints but don't actually use the memory?
 That represents the code that needs to be fixed. Breaking malloc()
 is not a suitable response IMO.
 
 As a data point, we routinely disable overcommit on our SGI machines
 and it doesn't hurt us one bit. And we aren't allocating gigabytes
 of swap space, either.
 
 --lyndon
 
 
 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Brian F. Feldman

You don't seem to understand that a runaway process/one designed just
to take up memory will be much more active than your little IMAP servers,
and be the one killed, if this scheme were used.

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 [EMAIL PROTECTED]   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread lyndon


 You don't seem to understand that a runaway process/one designed just
 to take up memory will be much more active than your little IMAP servers,
 and be the one killed, if this scheme were used.

No, what I don't understand is how the current behaviour can tell that
my temporary and *valid* need for a large chunk of memory does not make
me a runaway process, and therefore subject to death.

--lyndon



Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Doug Rabson

On 14 Jul 1999, Chris G. Demetriou wrote:

 Doug Rabson [EMAIL PROTECTED] writes:
  Overcommit can be used for many reasons. I use it to reserve a large
  linear address space to mmap alpha i/o spaces [...]
 
 Overcommit can be used for many reasons, but unless you've
 misdescribed what you're doing, _that's not one of them_.
 
 The mapped I/O pages need no backing store to be allocated for them by
 the VM system.  They're backed by hardware.
 
 And if you have 'placeholder' pages (I note that you didn't say you
 mmap all of alpha i/o space, just reserve a large linear address space
 in which to mmap it), then it should be possible to map them in such a
 way (e.g. read-only ZFOD) in which they wouldn't count against backing
 store requirements, either.

I certainly don't need or want backing store for these pages. The original
reserved region is never touched without first mapping device pages onto
it.

--
Doug Rabson Mail:  [EMAIL PROTECTED]
Nonlinear Systems Ltd.  Phone: +44 181 442 9037





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread David Brownlee

On Thu, 15 Jul 1999, Daniel C. Sobral wrote:

 For the record, professional digital cameras go into the $100K
 range, so I'd be expecting it not only to run Apache, but also to
 come with Doom. :-)

Well you have 16MB RAM, 32MB flash memory, a network interface,
other bits and NetBSD for ~ $1600. Find yourself a remote display
and fire up your compiler :)

http://www.brains.co.jp/mmeye/index-e.html


David/absolute

 -=-  "Just adding to the wrinkles on his deathly frown"  -=-






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Michael Richardson


 "John" == John Nemeth [EMAIL PROTECTED] writes:
John On one system I administrate, the largest process is typically
John rpc.nisd (the NIS+ server daemon).  Killing that process would be a
John bad thing (TM).  You're talking about killing random processes.
John This is no way to run a system.  It is not possible for any
John arbitrary decision to always hit the correct process.  That is a
John decision that must be made by a competent admin.  This is the
John biggest argument against overcommit: there is no way to gracefully
John recover from an out of memory situation, and that makes for an
John unreliable system.

  No, I don't agree. 

  This is the biggest argument against solving the overcommit situation with
SIGKILL. I have no problem with overcommit as a concept, I have a problem
with being unable to keep my possibly big processes (X, rpc.nisd,
etc. depending on circumstances) from being victims.

] Train travel features AC outlets with no take-off restrictions|  firewalls  [
]   Michael Richardson, Sandelman Software Works, Ottawa, ON|net architect[
] [EMAIL PROTECTED] http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another NetBSD/notebook using, kernel hacking, security guy");  [




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread John Nemeth

On Jul 15,  2:40am, "Daniel C. Sobral" wrote:
} Garance A Drosihn wrote:
}  At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
}   In which case the program that consumed all memory will be killed.
}   The program killed is +NOT+ the one demanding memory, it's the one
}   with most of it.
}  
}  But that isn't always the best process to have killed off...
} 
} Sure it is. :-) Let's see...

 This statement is absurd.  Only a competent admin can decide
which process can be killed.  No arbitrary decision is going to be
correct.

}  It would be nice to have a way to indicate that, a la SIGDANGER.
} 
} Ok, everybody is avoiding this, so I'll comment. Yes, this would be

 The reason I've ignored it, is because SIGDANGER is a hack on top
of a very bad hack.

} interesting, and a good implementation will very probably be
} committed. *BUT*, this is not as useful as it seems. Since the
} correct solution is buy more memory/increase swap (correct solution
} for our target markets, anyway), there is little incentive to
} implement it.

 In case you hadn't noticed, this debate is cross-posted to
NetBSD.  NetBSD's target market isn't the same as FreeBSD's target
market.  This answer is NOT the correct solution for NetBSD's target
market.  Heck, except for one rather vocal person, FreeBSD's target
market may not consider it to be the correct solution either.  I most
certainly do not consider it to be correct, and I admin a lot of
mission critical servers.

}-- End of excerpt from "Daniel C. Sobral"



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Michael Richardson


 "Ben" == Ben Rosengart [EMAIL PROTECTED] writes:
Ben On Wed, 14 Jul 1999, John Nemeth wrote:

 On one system I administrate, the largest process is typically
 rpc.nisd (the NIS+ server daemon).  Killing that process would be a
 bad thing (TM).  You're talking about killing random processes.  This
 is no way to run a system.  It is not possible for any arbitrary
 decision to always hit the correct process.  That is a decision that
 must be made by a competent admin.  This is the biggest argument
 against overcommit: there is no way to gracefully recover from an out
 of memory situation, and that makes for an unreliable system.

Ben $DEITY on a pogo stick, how many times do we have to hear the same
Ben hypothetical argument?

Ben Tell me, Mr. Nemeth, has this ever happened to you?  Have you ever
Ben come *close*?

  Uh, since we don't run overcommit, the answer is specifically *NO*.

  We have never had lack of swap space randomly kill one of our processes.
This is good, and this is the way we want to keep it. 

  I have had it happen on other systems. (Solaris, AIX) It was very
mystifying to diagnose. Sure, the systems were misconfigured for what we
were trying to do, but if I wanted to build a custom system for every
application well... I'd be running NT.

] Train travel features AC outlets with no take-off restrictions|  firewalls  [
]   Michael Richardson, Sandelman Software Works, Ottawa, ON|net architect[
] [EMAIL PROTECTED] http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another NetBSD/notebook using, kernel hacking, security guy");  [



Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Robert Elz

Date:Thu, 15 Jul 1999 00:53:17 +0900
From:"Daniel C. Sobral" [EMAIL PROTECTED]
Message-ID:  [EMAIL PROTECTED]

  | Would you care to name such systems?

munnari was one (the system of the From: header, even though this
mail isn't actually going anywhere near it).   I will describe it
a bit lower down.

  | And, btw, a system consuming
  | all memory is *not* necessarily approaching paging death.

No, of course not, though I didn't say all memory, I said all VM.
And while it is possible to have all VM consumed, and no paging activity
at all, that would tend to indicate insufficient VM allocated
(reaching an artificial barrier).

  | More
  | likely, it is just storing a lot of data in the swap which will
  | never be used (which is the whole point of overcommit in first
  | place), and, thus, never paged in.

The systems I describe were not using overcommit; further, I wouldn't
imagine that a system storing anything to swap would be overcommitting - as
I understand the term, overcommit only relates to allocating VM resources
which aren't backed by anything physical at all ("here's all this
address space you can play in if you like, but you had better not
actually do that, because if you do it won't work").   Either applied
to one process, as that wording suggests, or aggregated over the whole
system.   If a process was (for some stupid reason) loading a whole
bunch of data into the swap space, that would be committed VM, and you
have to have the resources to cope with it.

Now to munnari.   It no longer runs quite like this, but munnari is
an alpha, 128MB, runs digital unix (not in overcommit mode, either is
possible there).   At the time of which I speak it ran two principal
applications of note, innd with a VM footprint about 100MB, and named,
with a memory footprint (at the time) of about 90MB (as it is now, it
no longer runs innd, but its named has grown to  120MB).

It also ran a bunch of small stuff (sendmail, typically 1 or 2 instances,
around 3MB each), ftpd (smaller, most often 0 or 1, sometimes 3 or 4,)
and the occasional shell (a few hundreds of MB) plus init getty cron
syslog and all that associated noise with mem requirements approaching 0.

That's fine.  Well, not really fine, innd and named would fight each
other all day for who had how much of the real memory, and who was
relegated to swap, of which there was enough for all this to fit, but
not a lot more than that (enough for one of them to fork when it
needed to, that's all - not both at once, and yes, overcommit would
have allowed both at once, but that was not an aim).

Then, because it was running innd, it was also running the perl script
that summarises the log file, that could grow to 30MB, maybe more.

And because it is running sendmail, every now and then you get the
typical sendmail huge queue syndrome (at least for old sendmails, which
this was), where you get a dead site, a large queue of processes, and
a bunch of sendmails running the queue, spending most of their time
hung on connection attempts that aren't working, and gradually growing
bigger (maybe 8 or 10 processes at 15MB each).

Somewhere amongst all of this swap would run out, and a good thing too,
as by this time the system really would be paging itself to oblivion.
Note that all this (large) VM I have described was filled with real data
(except for the odd times when innd or named had just forked), none of it
could be overcommitted and just ignored.   Whatever policy was in place,
the physical VM resources would have run out.
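
The figures above can be added up in a few lines of Python. The sizes are
the ones given in this message; the instance counts take the upper ends
mentioned (2 ordinary sendmails, 10 queue runners), and ftpd, the shells
and the small daemons are left out because no firm sizes are given for
them.

```python
# (description, size in MB, instances), as described in the message.
WORKLOAD = [
    ("innd", 100, 1),
    ("named", 90, 1),
    ("sendmail (normal)", 3, 2),
    ("perl log summary", 30, 1),
    ("sendmail (queue runners)", 15, 10),
]
PHYSICAL_RAM_MB = 128


def peak_demand_mb(workload):
    """Total VM demand if everything listed is resident at once."""
    return sum(size * count for _, size, count in workload)


if __name__ == "__main__":
    total = peak_demand_mb(WORKLOAD)
    print(total, "MB demanded,", round(total / PHYSICAL_RAM_MB, 1), "x physical RAM")
```

On these figures the worst case is 376MB of real data against 128MB of
RAM, which is why the machine would be paging heavily well before swap
ran out.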

Now let's look at what happens with the two methods.

With all VM backed by real mem or swap space, processes go about allocating
memory - when there is no more left, the allocations start failing.
If the process is perl, it just collapses in a heap, and the log file
summary doesn't get made that day.   So sad...   If it's sendmail, it
issues "OS error, temporary failure" type responses, saves its queue files,
and exits.   A later sendmail will deliver those messages, no harm.
If it's a shell, who knows (I forget what the shells do, I think most just
keep trying, at least if interactive), but they consume mem at such a slow
rate it doesn't matter - fork() would typically fail though, so no new
processes could get started.   innd would just pause, and wait till a
bit later when mem might be available again (those perls and sendmails
all gone away).   named just the same (at least the named munnari ran).
They're the two processes munnari was supposed to be running - those two
don't just die.

Now, with overcommit mode, we get an extra 30 seconds of life, because
no doubt there are a few pages floating around that have been allocated
to some process, but nothing has bothered to write into yet.   An extra 30
seconds if we're lucky (except if we followed the advice given here
earlier which would indicate that only 1/8 the amount of swap space would
be needed, in which case these processes would never have gotten started
in 

Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Matthew Dillon

:Now let's look at what happens with the two methods.
:
:With all VM backed by real mem or swap space, processes go about allocating
:memory - when there is no more left, the allocations start failing.
:If the process is perl, it just collapses in a heap, and the log file
:summary doesn't get made that day.   So sad...   If it's sendmail, it
:issues "OS error, temporary failure" type responses, saves its queue files,
:and exits.   A later sendmail will deliver those messages, no harm.
:If it's a shell, who knows (I forget what the shells do, I think most just
:keep trying, at least if interactive), but they consume mem at such a slow
:rate it doesn't matter - fork() would typically fail though, so no new
:processes could get started.   innd would just pause, and wait till a
:bit later when mem might be available again (those perls and sendmails
:all gone away).   named just the same (at least the named munnari ran).
:They're the two processes munnari was supposed to be running - those two
:don't just die.

Which means that if one of those two processes happens to be the one
primarily responsible for running the machine out of VM, memory resources
will never be released and now you can't even login!  Not only that, but 
if you are running a news subsystem, it is actually *worse* if the news
process bogs down and gets behind than it is for the news process to simply 
die and alert someone.   When you are pushing news, you cannot afford to
get behind.

Also, your named is badly misconfigured if it grows to 130MB.  We never
allow ours to grow past 30MB.

Since the machine is basically in an unworking state anyway, and since
you can now no longer login, I don't quite see why you are happy that
those two processes are still running.  From my standpoint, the machine
is badly broken and needs to be rebooted and then fixed so the problems
do not reoccur and I would be much happier if I could log into the beast
to get that done than to have to hit the reset button.

:Now, with overcommit mode, we get an extra 30 seconds of life, because
:no doubt there are a few pages floating around that have been allocated
:to some process, but nothing has bothered to write into yet.   An extra 30
:... garbage removed ...
:Sure it would get lots of VM back again, but the system would no longer
:have been doing what it was supposed to be doing.   Adding more swap space

The machine isn't doing what it is supposed to be doing in either case
once it has run out of VM.  Except in the first case you think you should
be happy because it didn't kill the news process, when in fact you ought
to be trying to figure out why the thing ran out of VM in the first place
and then fix it so it never happens again.

To me, this whole scenario sounds like a badly configured machine which
the sysop isn't willing to fix.  I feel sorry for the poor company who 
hired that sysop!

:would be easy, but the wrong thing to do, that would just have allowed
:the system to page itself to death, thrashing into eternity - having
:processes go away is the only solution to this kind of problem.   Except
:it needs to be the right processes, and "right" does not equal "big",
:nor any other criteria the kernel could possibly figure out for itself.
:
:kre

If you consider this a critical problem, then the only acceptable solution
is to write a watchdog script that monitors swap utilization and kills
the correct processes if swap starts to get low.  If you wait until swap
actually runs out, you've already lost because too many things are likely
to break in a general purpose computing environment.  Of course I suppose
you could advocate that programs must be written 'properly' to handle 
the case... well, more power to you, but in a general computing environment
you are running dozens if not hundreds of third party applications and
fixing them all is a pipe dream.

It seems to me that you are willing to blame the operating system for
a situation that is really not the OS's fault, and that you are not willing
to sit down and spend the 10 minutes necessary writing a simple watchdog
script.

I don't bother to write watchdog scripts to check for swap, because my
machines DO NOT RUN OUT OF SWAP.  If your machines do, then maybe you
should consider writing the watchdog script.  Personally, I think you would
get better reliability by fixing your systems.
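
A minimal sketch of the kind of watchdog described above, in Python.  The
threshold, the list of expendable process names, and the function names are
all hypothetical; how the swap figures and the process list are obtained
(swapinfo, pstat -s, ps, or similar) is left out and assumed to be supplied
by the caller.

```python
import os
import signal

# Hypothetical policy: these names and numbers are illustrative only.
EXPENDABLE = ("perl", "sendmail")   # processes that can safely be restarted
LOW_WATER = 0.90                    # act once 90% of swap is in use

def swap_fraction(used_kb, total_kb):
    """Fraction of swap in use; 0.0 if no swap is configured."""
    return used_kb / total_kb if total_kb else 0.0

def pick_victims(procs, used_kb, total_kb, low_water=LOW_WATER):
    """Return the PIDs to kill, largest expendable process first.

    procs is a list of (pid, name, size_kb) tuples.  Processes whose name
    is not in EXPENDABLE (for example innd or named) are never selected.
    """
    if swap_fraction(used_kb, total_kb) < low_water:
        return []
    candidates = [p for p in procs if p[1] in EXPENDABLE]
    candidates.sort(key=lambda p: p[2], reverse=True)
    return [pid for pid, _name, _size in candidates]

def kill_victims(pids, sig=signal.SIGTERM):
    """Send sig to each PID, ignoring processes that have already exited."""
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass
```

The point of the sketch is that the site, not the kernel, decides which
processes are the "right" ones to lose, and that it acts before swap is
actually exhausted.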

You are blaming what is essentially a last-resort effort by the kernel for
not being nice to your processes.  Well Duh!  It's a last-resort mechanism,
it isn't supposed to be nice.  Maybe you shouldn't be depending on last
resort mechanisms to keep your machines running.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Jason Thorpe

On Tue, 13 Jul 1999 23:18:58 -0400 (EDT) 
 John Baldwin [EMAIL PROTECTED] wrote:

  What does that have to do with overcommit?  I student administrate an undergrad
  CS lab at a university, and when students' programs misbehave, they generate a
  fault and are killed.  The only machines that reboot on us without being
  explicitly told to are the NT ones, and yes we run FreeBSD.

What does it have to do with overcommit?  Everything in the world!

If you have a lot of users, all of whom have buggy programs which eat
a lot of memory, per-user swap quotas don't necessarily save your butt.

And maybe the individual programs didn't encounter their resource limits.

...but the sheer number of these runaway things caused the overcommit to
be a problem.  If malloc() or whatever had actually returned NULL at the
right time (i.e. as backing store was about to become overcommitted), then
these runaway processes would have stopped running away (they would have
gotten a SIGSEGV and died).

Anyhow, my "lame undergrads" example comes from a time when PCs weren't
really powerful enough for the job (or something; anyhow, we didn't have
any in the department :-).  My example is from a Sequent Balance (16
ns32032 processors, 64M RAM [I think; been a while], 4.2BSD variant).

-- Jason R. Thorpe [EMAIL PROTECTED]






Re: Replacement for grep(1) (part 2)

1999-07-14 Thread sthaug

 Also, your named is badly misconfigured if it grows to 130MB.  We never
 allow ours to grow past 30MB.

How do you know what kind of name server configuration kre is running?
Here's an example of a name server running *non-recursive*, serving
11.500 zones:

  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
27162 root       2    0    70M   57M sleep 271:01  3.27%  3.27% named

Are you saying that such configurations should be illegal?

Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Matthew Dillon


:
: Also, your named is badly misconfigured if it grows to 130MB.  We never
: allow ours to grow past 30MB.
:
:How do you know what kind of name server configuration kre is running?
:Here's an example of a name server running *non-recursive*, serving
:11.500 zones:
:
:  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
:27162 root       2    0    70M   57M sleep 271:01  3.27%  3.27% named
:
:Are you saying that such configurations should be illegal?
:
:Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]

I assumed that since the guy said that his named GREW, he was
running a recursive/caching named.

Obviously if you are running a non-recursive named the static size
will depend on the zones you are serving.  Duh!

It is not generally beneficial to allow a caching named to exceed 30MB
or so on a system that is doing other things.  If the system starts to
page (which this person's system is obviously doing), then it is doubly
a bad idea to allow a named to grow that large.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Matthew Dillon

:On Tue, 13 Jul 1999 23:18:58 -0400 (EDT) 
: John Baldwin [EMAIL PROTECTED] wrote:
:
:  What does that have to do with overcommit?  I student administrate an undergrad
:  CS lab at a university, and when students' programs misbehave, they generate a
:  fault and are killed.  The only machines that reboot on us without being
:  explicitly told to are the NT ones, and yes we run FreeBSD.
:
:What does it have to do with overcommit?  Everything in the world!
:
:If you have a lot of users, all of whom have buggy programs which eat
:a lot of memory, per-user swap quotas don't necessarily save your butt.

If every single one of your users is trying to crash your machine daily,
maybe you should consider throwing them off the system and finding users
that are less hostile.

This conversation is getting silly.  Do you actually believe that
an operating system can magically protect itself 100% from armloads of 
hostile users?

Give me a break.  You people are crazy.  If you have something worthwhile
to say I'll listen, but these "the sky is falling!" arguments are idiotic.

-Matt





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Jason Thorpe

On Wed, 14 Jul 1999 12:43:07 + 
 Niall Smart [EMAIL PROTECTED] wrote:

  Perhaps it could be an additional flag to mmap, in this way
  people wishing to run an overcommitted system could do so
  but those writing programs which must not overcommit for
  certain memory allocations could ensure they did not do so.

This has already been mentioned.  SVR4 has MAP_NORESERVE specifically
for this purpose.

-- Jason R. Thorpe [EMAIL PROTECTED]






Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Jason Thorpe

On Thu, 15 Jul 1999 01:52:11 +0900 
 "Daniel C. Sobral" [EMAIL PROTECTED] wrote:

   ...um, so, make the code that deals with faulting in the stack a bit smarter.
  
  Uh? Like what? Like overcommitting, for instance? The beauty of
  overcommitting is that either you do it or you don't. :-)

One option is to special-case overcommit the stack.  Another is to
set the default stack limits to something more reasonable on a system
where overcommit is disabled.

-- Jason R. Thorpe [EMAIL PROTECTED]






Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Jason Thorpe

On Thu, 15 Jul 1999 01:59:12 +0900 
 "Daniel C. Sobral" [EMAIL PROTECTED] wrote:

   That's why you make it a switch.  No, really, you *can* just make it
   a switch.
  
  So, enlighten me, please... how do you switch it in NetBSD?

When the code to do it is implemented (not that hard, really, and it is
in the list of things to do with UVM), a sysctl will enable/disable
overcommit checking.  There would be like 4 or 5 places in the code
where this boolean switch would have to be tested.
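
What such a switch changes can be shown with a toy accounting model in
Python.  This is only an illustration, not UVM or any real kernel code; the
class and method names are made up, and a single counter stands in for the
"4 or 5 places" where the flag would be tested.

```python
class BackingStore:
    """Toy accounting of RAM+swap pages; not real kernel code."""

    def __init__(self, total_pages, overcommit=True):
        self.total_pages = total_pages
        self.overcommit = overcommit   # the hypothetical sysctl-style switch
        self.reserved = 0              # pages promised to processes
        self.touched = 0               # pages actually written to

    def reserve(self, pages):
        """Model an allocation request; False models malloc() returning NULL."""
        if not self.overcommit and self.reserved + pages > self.total_pages:
            return False
        self.reserved += pages
        return True

    def touch(self, pages):
        """Model writing to reserved pages; False models running out of VM."""
        if self.touched + pages > self.reserved:
            raise ValueError("touching pages that were never reserved")
        if self.touched + pages > self.total_pages:
            return False
        self.touched += pages
        return True
```

With overcommit off, the failure shows up at reserve() time, where the
caller can handle it.  With overcommit on, reserve() always succeeds and the
failure can only show up later, in touch().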

-- Jason R. Thorpe [EMAIL PROTECTED]






Re: Swap subsystem overhead (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Nik Clayton

On Tue, Jul 13, 1999 at 05:12:30PM -0700, Matthew Dillon wrote:
 Ok, I will be more specific.
 
 Under FreeBSD-STABLE *AND* FreeBSD-CURRENT, FreeBSD allocates metadata
 structures that scale to the amount of swap space assigned to the system.
 However, it is not *precisely* the amount of swap space.

snip

 Under FreeBSD-stable, just look under "VM pgdata" to see how much 
 memory is being wired to support the swap subsystem.  This usage covers
 both the fixed and dynamic allocations.

OK, at the risk of reawakening that particular thread -- if people are a 
little uneasy about Matt committing to src/*, how about letting him commit 
to doc/* instead?

Matt -- some of these messages of yours could probably turn into great
articles for DaemonNews, or the FreeBSD 'zine, if you were that way
inclined. . .

N
-- 
 [intentional self-reference] can be easily accommodated using a blessed,
 non-self-referential dummy head-node whose own object destructor severs
 the links.
-- Tom Christiansen in [EMAIL PROTECTED]





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Matthew Dillon

:
:One option is to special-case overcommit the stack.  Another is to
:set the default stack limits to something more reasonable on a system
:where overcommit is disabled.
:
:-- Jason R. Thorpe [EMAIL PROTECTED]

Try setting all the resource limits to something reasonable on general
principles.  It would work as well in an overcommit system as it would 
in a non-overcommit system.
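
As a rough sketch of what setting limits looks like from inside a process,
here is a Python fragment using the standard resource module.  The helper
names and the example values are assumptions chosen for illustration, not
recommendations; only soft limits are lowered, so no privileges are needed.

```python
import resource

def capped(current, cap):
    """Return a (soft, hard) pair with the soft limit lowered to at most cap.

    current is a (soft, hard) pair as returned by resource.getrlimit().
    The hard limit is left unchanged.
    """
    soft, hard = current
    inf = resource.RLIM_INFINITY
    new_soft = cap if soft == inf else min(soft, cap)
    if hard != inf:
        new_soft = min(new_soft, hard)
    return (new_soft, hard)

def apply_caps(caps):
    """Lower soft limits for this process and any children it starts later."""
    for which, cap in caps.items():
        resource.setrlimit(which, capped(resource.getrlimit(which), cap))

# Hypothetical values, chosen only for illustration.
EXAMPLE_CAPS = {
    resource.RLIMIT_STACK: 8 * 1024 * 1024,     # 8 MB stack
    resource.RLIMIT_DATA: 256 * 1024 * 1024,    # 256 MB data segment
}
```

Calling apply_caps(EXAMPLE_CAPS) early in a daemon or login wrapper would
apply the same limits whether or not the kernel overcommits.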

-Matt
Matthew Dillon 
[EMAIL PROTECTED]





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Nate Williams

[ Trimmed CC list a bit ]

  :* even if you are not willing to pay that price, there _are_ people
  :who are quite willing to pay that price to get the benefits that they
  :see (whether it's a matter of perception or not, from their
  :perspective they may as well be real) of such a scheme.
  
  Quite true.  In the embedded world we preallocate memory and shape
  the programs to what is available in the system.  But if we run out
  of memory we usually panic and reboot - because the code is designed
  to NOT run out of memory and thus running out of memory is a catastrophic
  situation.

*ACK*  This is unacceptable in many 'embedded' systems.

 There's a whole spectrum of embedded devices, and applications that
 run on them.  That definition works for some of them, but definitely
 not all.
 

Totally agreed.  A previous poster brought up the fact that *some*
embedded systems are built to deal with 'out of memory' situations, and
that the 'total' amount of memory used in the system can be used by
other parts of the system.

For performance reasons, a particular application may choose to 'cache'
data, but in a low-memory situation it can 'free' up a lot of memory.  You
don't want to put hard-coded limits on the process simply because if the
memory is there you want it to be able to use it, but you *certainly*
don't want to go through a reboot just to get memory back.
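
A minimal Python sketch of that kind of cache: no fixed size limit, but able
to give memory back when asked.  The class name is made up, and the
low-memory notification that would call shrink() is hypothetical and not
shown.

```python
from collections import OrderedDict

class ShrinkableCache:
    """LRU cache with no fixed size limit that can shed entries on demand."""

    def __init__(self):
        self._items = OrderedDict()   # least recently used entry first

    def get(self, key, compute):
        """Return the cached value for key, computing and storing it if missing."""
        if key in self._items:
            self._items.move_to_end(key)   # mark as recently used
            return self._items[key]
        value = compute(key)
        self._items[key] = value
        return value

    def shrink(self, keep):
        """Drop least recently used entries until at most keep remain.

        Returns the number of entries dropped.  A low-memory signal from the
        system (hypothetical here) would call this.
        """
        dropped = 0
        while len(self._items) > keep:
            self._items.popitem(last=False)
            dropped += 1
        return dropped

    def __len__(self):
        return len(self._items)
```

The cache grows freely while memory is available, and a single shrink()
call releases most of it without restarting anything.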

[ And, I don't want to write my own OS to do this for me. :) ]

(However, I agree that for general purpose computing, over-commit is the
way to go.  But, *BSD is not just for general purpose computing,
although that is its primary market.)



Nate





Re: Replacement for grep(1) (part 2)

1999-07-14 Thread Matthew Dillon

:  
:  Quite true.  In the embedded world we preallocate memory and shape
:  the programs to what is available in the system.  But if we run out
:  of memory we usually panic and reboot - because the code is designed
:  to NOT run out of memory and thus running out of memory is a catastrophic
:  situation.
:
:*ACK*  This is unacceptable in many 'embedded' systems.

Don't confuse a watchdog panic with other conditions.  If the embedded
system software is supposed to deal with a low-memory condition and can't,
the failsafe is all that's left between it and infinity.

The statement that the kernel's overcommit methodology somehow prevents
one from being able to build embedded systems on top of it is just plain
incorrect.  The embedded system is perfectly capable of implementing its
own memory management to avoid the failsafe provided by the kernel.

Most of the embedded systems I've built -- mainly remote telemetry units
running with flash and a megabyte or so of ram -- panic and reboot if they
run out of memory.  I have several dozen units in the field, each keeping
track of several thousand data points on 2 minute intervals, which have
never crashed.  The only time we reboot them is when we need to upgrade
the OS core.  The last time was 4 years ago.  *These* units will panic
and reboot if they run out of memory because the software is designed not
to.  It is as simple as that.

-Matt





