Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Narvi



[cc: list trimmed]

On Thu, 15 Jul 1999 [EMAIL PROTECTED] wrote:

  In that scenario, the 512MB of swap I assigned to this machine would be
  dangerously low.
 
 With 13GB disks available for a couple of hundred bucks, my machines aren't
 going to run out of swap space any time soon, even if I commit to disk.
 
 All I want for Christmas is a knob to disable overcommit.
 
 --lyndon
 

CVSup the source repository and start writing.

Sander

There is no love, no good, no happiness and no future -
all these are just illusions.



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message

Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel C. Sobral


Matthew Dillon wrote:
 
 Something is weird here.  If the solaris people are using a
 SWAPSIZE + REALMEM VM model, they have to allow the
 allocated + reserved space go +REALMEM bytes over available swap
 space.  If not they are using only a SWAPSIZE VM model.

I did not check if the model was a SWAPSIZE+REALMEM or a SWAPSIZE
model. Anyway, I think you are assuming that the "swap -s" command
shows as total memory just the swap space... Maybe, maybe not. I
don't know. But the space against which I reached the ceiling *was*
the one reported in the "swap -s" command.

 Wait - does Solaris normally use swap files or swap partitions?
 Or is it that weird /tmp filesystem stuff?  If it normally uses swap
 files and allows holes then that explains everything.

I'd say partitions. While perusing man pages, I caught briefly the
comment that a swap partition could overwrite a normal partition, in
a man page about a special command to create swap partitions.

Anything you'd like me to check in particular? If you have any
source code you'd like me to run, just send it to
[EMAIL PROTECTED], though I can only run them at the
earliest on Monday. Well, at least my Monday is your Sunday night...
:-)

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Sean Witham




"Daniel C. Sobral" wrote:

  It would be nice to have a way to indicate that, a la SIGDANGER.
 
 Ok, everybody is avoiding this, so I'll comment. Yes, this would be
 interesting, and a good implementation will very probably be
 committed. *BUT*, this is not as useful as it seems. Since the
 correct solution is buy more memory/increase swap (correct solution
 for our target markets, anyway), there is little incentive to
 implement it.
 
 So, I think people who can answer the above are thinking like "Well,
 it is useful, but it's not useful enough for me to spend my time on
 it, and I sure as hell don't want to write mini-papers on why it's
 not that useful".
 

For those who wish to develop code for safety related systems that is
not good enough. They have to prove that all code can handle the
degradation of resources gracefully. Such code relies on guaranteed
memory allocations or at the very least warnings of memory shortage
and prioritized allocations. So the least important sub-systems die
first.

--Sean



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon



:
:For those who wish to develop code for safety related systems that is
:not good enough. They have to prove that all code can handle the
:degradation of resources gracefully. Such code relies on guaranteed
:memory allocations or at the very least warnings of memory shortage
:and prioritized allocations. So the least important sub-systems die
:first.
:
:--Sean

I'm sorry, but when you write code for a safety related system you
do not dynamically allocate memory at all.  It's all essentially static.
There is no issue with the memory resource.  Besides, none of the BSD's are
certified for any of that stuff that I know of.

What's next:  A space shot?  These what-if scenarios are getting
ridiculous.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit

1999-07-16 Thread Garance A Drosihn


At 9:52 PM -0700 7/15/99, Matthew Dillon wrote:
: ...  How many programmers bother to even *clear* errno before
: making these calls (since some system calls do not set errno
: if it is already non-zero).  Virtually nobody.
:
:Erm... WTF?!?! If so, why the HELL are we doing that?!?

No, wait, I got that wrong I think.

Oh yah, I remember now.  Hmm.  How odd.  I came across a case
where read() could return -1 and not set errno properly if
errno was already set, but a perusal of the kernel code seems
to indicate that this can't happen.  Very weird.

For what it's worth, I know I've run into situations where errno
had to be cleared before calling some system routine (but I don't
think it was read, and I am sure it wasn't on freebsd).

As I remember it, it was some case where "sysrtn1" called another
system routine (and you would be calling "sysrtn1").  If the call
to the inner system routine failed, then the inner routine would
set errno and "sysrtn1" would return its own error.  However,
"sysrtn1" would return the SAME error in some other circumstances,
circumstances which did not set errno.  So, if you wanted to check
errno when you got an error return from "sysrtn1", you had to be
sure to zero it out before calling "sysrtn1".

This was a lesson taught after a long wild-goose chase trying
to track down the wrong reason for an error-return from "sysrtn1"
(whatever that routine was...), because we had NOT zeroed out
errno first.
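
The routine in question is never named above, so the following is only a sketch of the same pattern using strtol(), a standard C library function that is documented to leave errno untouched on success. The helper name parse_long is hypothetical. It shows why errno has to be zeroed before the call if it is going to be inspected afterwards:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal long from `text`.
 * Returns 0 on success and stores the value in *out, -1 on failure.
 *
 * strtol() reports overflow only through errno and does not reset
 * errno when it succeeds, so a stale value left by an earlier call
 * would look like a fresh error unless it is cleared first.
 */
int parse_long(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;                          /* discard any stale error */
    value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return -1;                      /* no digits, or trailing junk */
    if (errno == ERANGE)
        return -1;                      /* value does not fit in a long */

    *out = value;
    return 0;
}
```

Without the `errno = 0` line, a caller that had an ERANGE left over from some earlier failure would see parse_long("42", ...) fail for no visible reason, which is the kind of wild-goose chase described above.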

---
Garance Alistair Drosehn   =   [EMAIL PROTECTED]
Senior Systems Programmer  or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Brownlee


On Fri, 16 Jul 1999, Matthew Dillon wrote:

 I'm sorry, but when you write code for a safety related system you
 do not dynamically allocate memory at all.  It's all essentially static.
 There is no issue with the memory resource.  Besides, none of the BSD's are
 certified for any of that stuff that I know of.
 
 What's next:  A space shot?  These what-if scenarios are getting
 ridiculous.

Well, NetBSD is slated to be used in the 'Space Acceleration
Measurement System II', measuring the microgravity environment on
the International Space Station using a distributed system based
on several NetBSD/i386 boxes.

Sometimes your 'what-if' scenarios are others' standard operating
procedures.

David/absolute

   What _is_, what _should be_, and what _could be_ are all distinct.






Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Alan C. Horn


On Fri, 16 Jul 1999, Matthew Dillon wrote:


:  Well, NetBSD is slated to be used in the 'Space Acceleration
:  Measurement System II', measuring the microgravity environment on
:  the International Space Station using a distributed system based
:  on several NetBSD/i386 boxes.
:
:  Sometimes your 'what-if' scenarios are others' standard operating
:  procedures.
:
:  David/absolute
:
:   What _is_, what _should be_, and what _could be_ are all distinct.

Ummm... this doesn't sound like a critical system to me.  It sounds like
an experiment.


It's probably an awfully expensive experiment (putting things into space
is not cheap)

From a financial viewpoint that may be considered critical.

Cheers,

Al


--
Alan Horn - Sysadmin - Dreamworks (+1 818 695 6256) - [EMAIL PROTECTED]
  I am Connor MacLeod of the Clan MacLeod. I was born in 1518 in the
village of Glenfinnan on the shores of Loch Sheil, and I am immortal.





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel Eischen


 I'm sorry, but when you write code for a safety related system you
 do not dynamically allocate memory at all.  It's all essentially static.
 There is no issue with the memory resource.  Besides, none of the BSD's are
 certified for any of that stuff that I know of.

Sometimes it's not feasible to statically allocate memory.  You
dynamically allocate all the memory you need at program initialization 
(and no, we don't want to manage a pool of memory ourselves - that's
what the OS is for).  

Note that languages such as Ada raise exceptions when memory allocation
fails.  The underlying run-time relies on malloc returning null in
order to raise an exception.  Normally, programs written in Ada
take great care to gracefully handle these exceptions.  All the C
programs that we've ever written also take great care in handling
NULL returns from malloc.
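
A minimal sketch of the allocate-everything-at-initialization approach, with the NULL check described above. The names struct buffers, buffers_init and buffers_destroy are hypothetical. One caveat that is an assumption about the platform rather than something stated in this message: on a system that overcommits, malloc can return non-NULL for memory that is not yet backed, so the sketch also writes to every byte at startup so that any shortage tends to show up during initialization rather than later.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical set of working buffers, all obtained at start-up. */
struct buffers {
    unsigned char *rx;
    unsigned char *tx;
    size_t size;
};

/*
 * Allocate everything up front.  Returns 0 on success.  On failure
 * returns -1 and leaves nothing allocated, so the caller can degrade
 * gracefully instead of discovering the shortage mid-operation.
 */
int buffers_init(struct buffers *b, size_t size)
{
    b->rx = malloc(size);
    b->tx = malloc(size);
    if (b->rx == NULL || b->tx == NULL) {
        free(b->rx);                    /* free(NULL) is a no-op */
        free(b->tx);
        b->rx = b->tx = NULL;
        b->size = 0;
        return -1;
    }

    /*
     * Touch every byte now.  A nonzero fill is used because some
     * compilers may fold malloc + zeroing memset into calloc, which
     * could leave the pages untouched.
     */
    memset(b->rx, 0xA5, size);
    memset(b->tx, 0xA5, size);
    b->size = size;
    return 0;
}

void buffers_destroy(struct buffers *b)
{
    free(b->rx);
    free(b->tx);
    b->rx = b->tx = NULL;
    b->size = 0;
}
```

Touching the pages does not make an overcommitting kernel safe. It only moves the point of failure to program start, where it is easier to handle.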

I have no problem with overcommit, but I can see the need that
some folks have for turning it off.  If you don't want to write
the code to allow this, that's fine - you don't want/need it,
so why should you?  But if other folks see a need for it, let
_them_ write the hooks for it :-)

Dan Eischen
[EMAIL PROTECTED]



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon


: I'm sorry, but when you write code for a safety related system you
: do not dynamically allocate memory at all.  It's all essentially static.
: There is no issue with the memory resource.  Besides, none of the BSD's are
: certified for any of that stuff that I know of.
:
:Sometimes it's not feasible to statically allocate memory.  You
:dynamically allocate all the memory you need at program initialization 
:(and no, we don't want to manage a pool of memory ourselves - that's
:what the OS is for).  
:...
:Note that languages such as Ada raise exceptions when memory allocation
:fails.  The underlying run-time relies on malloc returning null in
:order to raise an exception.  Normally, programs written in Ada

Simply set a resource limit. 

You are making the classic mistake of assuming that a fail-safe in the
O.S. must be integrated all the way down into the user level when, 
in fact, it is simply a matter of setting a resource limit.

When you are running an embedded system and have full control over the
software being run, setting resource limits will do what you want.  By
doing so you are effectively managing the software modules on a 
module-by-module basis and not allowing one module to indirectly affect
another.  This is what you want to do in an embedded system:  You do
not want to create a situation where a failure in one module cascades
into others.
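
A rough sketch of what setting such a limit looks like from inside a process, using the POSIX getrlimit()/setrlimit() calls. The choice of RLIMIT_AS is an assumption made for this example (a given system may call for RLIMIT_DATA or another limit instead), and the helper names limit_address_space and allocation_refused are hypothetical:

```c
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: cap this process's address space at max_bytes
 * (clamped to the hard limit).  Returns 0 on success, -1 on error.
 */
int limit_address_space(rlim_t max_bytes)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_AS, &lim) != 0)
        return -1;
    if (lim.rlim_max != RLIM_INFINITY && max_bytes > lim.rlim_max)
        max_bytes = lim.rlim_max;
    lim.rlim_cur = max_bytes;           /* soft limit only */
    return setrlimit(RLIMIT_AS, &lim);
}

/*
 * Hypothetical probe: returns 1 if malloc refuses a request of
 * `bytes` under the current limit, 0 if it succeeds.
 */
int allocation_refused(size_t bytes)
{
    void *p = malloc(bytes);

    if (p == NULL)
        return 1;
    /* volatile store keeps the compiler from eliding the allocation */
    ((volatile char *)p)[0] = 0;
    free(p);
    return 0;
}
```

With a limit in place, a module that asks for more than its budget gets NULL back from malloc and can handle it locally, instead of pushing the whole machine into a swap shortage. In practice the limit would more likely be set by whatever launches the module (a shell's limit/ulimit builtin, or a parent process before exec) than by the module itself.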

-Matt
Matthew Dillon 
[EMAIL PROTECTED]

:take great care to gracefully handle these exceptions.  All the C
:programs that we've ever written also take great care in handling
:NULL returns from malloc.
:
:I have no problem with overcommit, but I can see the need that
:some folks have for turning it off.  If you don't want to write
:the code to allow this, that's fine - you don't want/need it,
:so why should you?  But if other folks see a need for it, let
:_them_ write the hooks for it :-)
:
:Dan Eischen
:[EMAIL PROTECTED]
:




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Scheidt


On Fri, 16 Jul 1999, Daniel C. Sobral wrote:

 Technical follow-up:
 
 Contrary to what I previously said, a number of tests reveal that
 Solaris, indeed, does not overcommit. All non-read only segments,

Neither does HP/UX 10.x. (Haven't got an 11 box handy to check.) 
The memory allocation process is something like this:
1) Reserve is allocated from a swap area.  Preference is given to
   swap devices, even if a swap file system has a higher priority.
2) If there is no space on a swap device, swap is allocated from a
   swap filesystem, if one is configured.  If there is nothing to be
   allocated in a swap filesystem, the kernel attempts to grow the
   swap file on a filesystem by swchunk (a tunable, default 2MB, I
   think).  (Swap on filesystems starts at zero or swchunk, and is
   grown as needed up to the limit spec'd at swapon(1M) time.)
3) If this fails, either because there is no space on the file system,
   or the swapfile has reached its limit, memory (actual core) is
   allocated.  The system tunable swapmem_on determines whether memory
   is used for swap reserve or not.  Default is to use it.
4) If there isn't swap to reserve, the request fails, even if none of
   the reserved swap is used.
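
The order of those steps can be sketched as a toy accounting model. This is not HP-UX code: struct swap_model and reserve_swap are hypothetical names, swap-file growth by swchunk is folded into a single "room left" number, and the model assumes a request is satisfied entirely from one area rather than split across several.

```c
#include <stddef.h>

/* Hypothetical model of the unreserved space in each area. */
struct swap_model {
    size_t device_free;   /* unreserved space on swap devices */
    size_t fs_free;       /* room the filesystem swap may still use */
    size_t mem_free;      /* core that may serve as swap reserve */
    int    swapmem_on;    /* nonzero: memory counts as reserve */
};

/*
 * Try to reserve `bytes`.  Returns 0 if the reservation succeeds,
 * -1 if the request must fail.  Only reservations are tracked; whether
 * the reserved space is ever actually used plays no part, which is
 * the point of step 4.
 */
int reserve_swap(struct swap_model *m, size_t bytes)
{
    if (m->device_free >= bytes) {          /* step 1: devices first */
        m->device_free -= bytes;
        return 0;
    }
    if (m->fs_free >= bytes) {              /* step 2: filesystem swap */
        m->fs_free -= bytes;
        return 0;
    }
    if (m->swapmem_on && m->mem_free >= bytes) {   /* step 3: core */
        m->mem_free -= bytes;
        return 0;
    }
    return -1;                              /* step 4: nothing to reserve */
}
```

Starting from { 100, 50, 30, 1 }, requests of 80, 40 and 25 land on the device, the filesystem and memory in turn, and a further request of 25 is refused even though none of the reserved space has been touched.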

The swapinfo(1M) man page makes this quite clear:

  +Requests for more paging space will fail when they cannot be
   satisfied by reserving device, file system, or memory paging,
   even if some of the reserved paging space is not yet in use.
   Thus it is possible for requests for more paging space to be
   denied when some, or even all, of the paging areas show zero
   usage - space in those areas is completely reserved.

The upside of this is that if you do run out of swap, the kernel doesn't
kill random processes.  The downside is, I have seen 4GB boxes, with
plenty of swap, run out with less than a gig of memory actually in use.
Oh, and if you swap to a filesystem, you can fill it up, without actually
using any of the space.

I don't know which behavior is more bogus.


David Scheidt




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Dominic Mitchell

On Thu, Jul 15, 1999 at 09:57:31PM -0700, Matthew Dillon wrote:
 Something is weird here.  If the solaris people are using a 
 SWAPSIZE + REALMEM VM model, they have to allow the 
 allocated + reserved space go +REALMEM bytes over available swap 
 space.  If not they are using only a SWAPSIZE VM model.
 
 Wait - does Solaris normally use swap files or swap partitions?
 Or is it that weird /tmp filesystem stuff?  If it normally uses swap 
 files and allows holes then that explains everything.

No, swap is slice based in Solaris.  tmpfs is just a filesystem (much
like MFS) which uses swap as backing store.  I will admit to never quite
understanding the relationship of how much swap tmpfs is willing to
steal though...  Maybe I should go and read the answerbook
(http://docs.sun.com if you want a peek).
-- 
Dom Mitchell -- Palmer & Harvey McLane -- Unix Systems Administrator

In Mountain View did Larry Wall
Sedately launch a quiet plea:
That DOS, the ancient system, shall
On boxes pleasureless to all
Run Perl though lack they C.



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Patrick Welche

Matthew Dillon wrote:
 
 :On Tue, 13 Jul 1999 23:18:58 -0400 (EDT) 
 : John Baldwin jobal...@vt.edu wrote:
 :
 :  What does that have to do with overcommit?  I student administrate an
 :  undergrad CS lab at a university, and when students' programs misbehave,
 :  they generate a fault and are killed.  The only machines that reboot on
 :  us without being explicitly told to are the NT ones, and yes we run
 :  FreeBSD.
 :
 :What does it have to do with overcommit?  Everything in the world!
 :
 :If you have a lot of users, all of which have buggy programs which eat
 :a lot of memory, per-user swap quotas don't necessarily save your butt.
 
 If every single one of your users is trying to crash your machine daily,
 maybe you should consider throwing them off the system and finding users
 that are less hostile.
 
 This conversation is getting silly.  Do you actually believe that
 an operating system can magically protect itself 100% from armloads of 
 hostile users?
 
 Give me a break.  You people are crazy.  If you have something worthwhile
 to say I'll listen, but these "the sky is falling!" arguments are idiotic.
 
   -Matt
 

students != hostile users

Making mistakes is part of learning.

Patrick



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Daniel C. Sobral

Patrick Welche wrote:
 
 students != hostile users

We obviously have known different students... :-)

 Making mistakes is part of learning.

A hostile user is one which will act in a non-friendly manner.
Whether intentionally or not is irrelevant from the point of view of
the administrator, as far as protecting the system goes.

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Matthew Dillon


:   Well, NetBSD is slated to be used in the 'Space Acceleration
:   Measurement System II', measuring the microgravity environment on
:   the International Space Station using a distributed system based
:   on several NetBSD/i386 boxes.
:
:   Sometimes your 'what-if' scenarios are others' standard operating
:   procedures.
:
:   David/absolute
:
:   What _is_, what _should be_, and what _could be_ are all distinct.

Ummm... this doesn't sound like a critical system to me.  It sounds like
an experiment.

None of the BSD's (nor NT, nor any other complex general purpose operating
system) are certified for critical systems in space.  The reason is
simple:  None of these operating systems can deal with memory faults 
caused by radiation.  You might see it for internal communications or
non-critical sensing, but you aren't going to see it for external
communications or thruster control.

-Matt
Matthew Dillon 
dil...@backplane.com



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread Brian F. Feldman

Can we kill this thread already? This resolves nothing. The only good
to come of this is all of the nice doc-proj input Matt is providing
(and providing well, I might add.)

There is no point that hasn't been rehashed a dozen times over, and
you (the ones who want overcommitting turned off) are not helping
the S/N ratio.

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 gr...@freebsd.org   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-16 Thread David Scheidt

On Fri, 16 Jul 1999, Daniel C. Sobral wrote:

 Technical follow-up:
 
 Contrary to what I previously said, a number of tests reveal that
 Solaris, indeed, does not overcommit. All non-read only segments,

Neither does HP/UX 10.x. (Haven't got an 11 box handy to check.) 
The memory allocation process is something like this:
1) reserve is allocated from a swap area.  Preference is given to
swap devices, even if a swap file system has a higher priority.
2) If there is no space on a swap device, swap is allocated from a 
swap filesystem, if one is configured.  If there is nothing to be
allocated in a swap filesystem, the kernel attempts to grow the 
swap file on a filesystem by swchunk (a tunable, default 2MB, I think).
(Swap on filesystems starts at zero or swchunk, and is grown as needed
up to the limit spec'd at swapon(1M) time.)
3) If this fails, either because there is no space on the file system, 
or the swapfile has reached its limit, memory (actual core) is allocated.
The system tunable swapmem_on determines whether memory is used for 
swap reserve or not.  Default is to use it.
4) If there isn't swap to reserve, the request fails, even if none of 
the reserved swap is used.  
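
The four steps above can be sketched as a toy model in C. This is only an
illustration of the order in which the sources are tried, not HP-UX code:
the struct, its field names, and the function reserve_swap() are all
hypothetical, sizes are in KB, and a request is assumed to be satisfied
from a single source rather than split across several.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for the sketch; none of these names are HP-UX's. */
struct swap_pool {
    size_t device_free;   /* unreserved KB on swap devices */
    size_t fs_free;       /* unreserved KB inside the current swap file */
    size_t fs_size;       /* current size of the swap file in KB */
    size_t fs_limit;      /* limit given when the swap file was enabled */
    size_t fs_disk_free;  /* KB still free on the underlying filesystem */
    size_t swchunk;       /* growth increment in KB */
    size_t mem_free;      /* KB of core usable as swap reserve */
    bool   swapmem_on;    /* may memory back a reservation? */
};

enum reserve_src { SRC_NONE, SRC_DEVICE, SRC_FS, SRC_MEMORY };

/* Reserve kb of paging space; report which source satisfied the request. */
enum reserve_src reserve_swap(struct swap_pool *p, size_t kb)
{
    assert(p != NULL);

    /* 1) Swap devices are preferred. */
    if (p->device_free >= kb) {
        p->device_free -= kb;
        return SRC_DEVICE;
    }

    /* 2) Filesystem swap: grow the file by swchunk while that is both
     *    needed and permitted by the limit and by free disk space. */
    while (p->swchunk > 0 && p->fs_free < kb &&
           p->fs_size + p->swchunk <= p->fs_limit &&
           p->fs_disk_free >= p->swchunk) {
        p->fs_size += p->swchunk;
        p->fs_free += p->swchunk;
        p->fs_disk_free -= p->swchunk;
    }
    if (p->fs_free >= kb) {
        p->fs_free -= kb;
        return SRC_FS;
    }

    /* 3) Fall back to core, if the tunable allows it. */
    if (p->swapmem_on && p->mem_free >= kb) {
        p->mem_free -= kb;
        return SRC_MEMORY;
    }

    /* 4) Nothing left to reserve: the request fails, even if none of
     *    the space reserved earlier has actually been used. */
    return SRC_NONE;
}
```

Note that nothing in the model tracks how much reserved space is actually
in use, which is the point of step 4: failure depends only on reservations.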

The swapinfo(1M) man page makes this quite clear:

  +Requests for more paging space will fail when they cannot be
   satisfied by reserving device, file system, or memory paging,
   even if some of the reserved paging space is not yet in use.
   Thus it is possible for requests for more paging space to be
   denied when some, or even all, of the paging areas show zero
   usage - space in those areas is completely reserved.

The upside of this is that if you do run out of swap, the kernel doesn't 
kill random processes.  The downside is, I have seen 4GB boxes, with 
plenty of swap, run out with less than a gig of memory actually in use.  
Oh, and if you swap to a filesystem, you can fill it up, without actually
using any of the space.

I don't know which behavior is more bogus.


David Scheidt




Re: Swap overcommit

1999-07-16 Thread Patryk Zadarnowski


 At 9:52 PM -0700 7/15/99, Matthew Dillon wrote:
 : ...  How many programmers bother to even *clear* errno before
 : making these calls (since some system calls do not set errno
 :
 : if it already non-zero).  Virtually nobody.
 :  ^^^
 :
 :Erm... WTF?!?! If so, why the HELL are we doing that?!?
 
 No, wait, I got that wrong I think.
 
 Oh yah, I remember now.  Hmm.  How odd.  I came across a case
 where read() could return -1 and not set errno properly if
 errno was already set, but a perusal of the kernel code seems
 to indicate that this can't happen.  Very weird.
 
 For what it's worth, I know I've run into situations where errno
 had to be cleared before calling some system routine (but I don't
 think it was read, and I am sure it wasn't on freebsd).

Ahem, I'm not sure what that's got to do with swap overcommit, but
anything to distract this thread is a good thing ;)

The correct thing to do with errno is to clear it before a call IFF
you are going to check its value on return from the call, simply
because the calls NEVER clear errno on success; they only set/change
it on error.  Every standard I've seen requires this behaviour quite
explicitly, and I'm pretty sure it's documented somewhere in the BSD man
pages too. It's definitely correct if you look at the syscall stub
code in libc.
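
As a minimal sketch of the usual idiom in C: look at the return value
first, and read errno only after the call has reported failure, so that a
stale value left over from some earlier call cannot mislead you. The
helper name probe_open() is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if path can be opened read-only,
 * otherwise the errno value that open() set when it failed. */
int probe_open(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return errno;   /* meaningful only because open() just failed */

    close(fd);
    return 0;           /* success: errno is deliberately not consulted */
}
```

Because errno is never read on the success path, it does not matter what
value it held before the call.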

And yes, almost all the code I've seen does the right thing when
it comes to handling errno, including checking its value on an error
from system call (usually by calling warn() or err()), so the
``Virtually nobody'' argument above is rather misguided.

If something in libc READS errno without clearing it (other than
err/warn functions that is ;), it's badly broken and should be fixed
in the library, not in the user code. IMHO.

patryk.




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Michael Schuster - TSC SunOS Germany


Hi everyone,

I've been following this discussion almost from the beginning, and I
have the feeling that we're not _really_ getting very far. There's good
arguments for and against overcommit, depending on your point of view
and your requirements.

What I do see is a not-so-openly voiced consensus that the way
resource(sp?) shortages are handled in an overcommitting system
(SIGKILL) makes some of us rather unhappy. I therefore suggest those of
us who would like to see a change in this area pool their efforts and
energies to work on a mechanism that handles resource shortage in a more
graceful way.

cheerio
Michael
-- 
[EMAIL PROTECTED]



Re: Swap overcommit

1999-07-15 Thread Daniel C. Sobral


[EMAIL PROTECTED] wrote:
 
 All of the arguments I've seen so far assume that one process is
 running off and grabbing all the available memory. That may be
 the most likely scenario, but it's most certainly not the *only*
 scenario. What if you have a whole bunch of "middle sized" processes
 running, all using memory efficiently, but in total using 95% of
 the available VM. A malloc(5*1024*1024) might work, but I need
 10 MB instead of 5MB. And my memory footprint is just a little
 bit bigger than the other guys. Instead of returning NULL to
 the malloc() request, *zap* I'm dead. How can you possibly
 call that sensible behaviour?

No process is killed at malloc() time. A process is killed when
(another) process needs more memory and it is not available.

 Yes, the machine is under-resourced. I can't help that -- it's not my
 machine. The machine belongs to a customer who happens to run my IMAP
 software, who also happens to have ignored our sizing guidelines. In
 this situation I have no choice but to deal with the low memory
 condition, and our code does that, if it's given the chance! At
 least give me the opportunity to deal with the situation gracefully.

If it was not for overcommit, you wouldn't be running half of what
you are running on that machine in the first place. So, overcommit is
helping you run much more for the same resources.

 What if we decided to defer errors from bind just because there
 weren't any mbufs available, and later killed the process when it
 tried to do network I/O? People would howl bloody murder! (== this is
 rhetorical, folks)

Out of mbufs does not result in system deadlock, out of memory does.

 The semantics of malloc() have been defined since almost the dawn of
 time. From the current manpage:
 
   RETURN VALUES
  The malloc() and calloc() functions return a pointer to the allocated
  memory if successful; otherwise a NULL pointer is returned.
 
 Nowhere does it say that allocated memory might not exist. Nowhere
 does it say that I have to touch all the allocated pages to make
 sure they are really there. Nowhere does it say process death at
 some non-deterministic time in the future might be a side effect
 of calling malloc().

And nowhere does it say it does not, of course. But that is beside
the point. malloc() works as specified. It is the behavior of the
system in a low-resource situation that leads to processes being
killed.

 Applications are written assuming that malloc() behaves in the
 documented manner. It is *not* acceptable to tell applications writers

Actually, applications are written assuming that malloc() will not
fail, generally speaking.

 that they have to provide their own management routines on top of malloc()
 (SEGV catchers and the like) if they want the long standing semantics
 of malloc() to be preserved. If the current malloc() cannot behave in
 the documented and expected manner it needs to be renamed, because
 malloc() it most certainly isn't.

It's funny how all these FreeBSD systems manage to gain such a good
reputation despite such an obvious flaw, isn't it? :-)

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Garance A Drosihn


At 6:29 PM -0700 7/14/99, Matthew Dillon wrote:
If 1G isn't enough, spend another $30 and throw 2G of swap
online.  Or perhaps dedicate an entire $150 disk and throw
6+ GB of swap online.

The equivalent setup using a non-overcommit model would require
considerably more swap to have the same reliability.

Please note that we're talking at cross-purposes here, mainly
because I didn't realize this same general topic was being
beaten to death in the 'replacement for grep' thread (which I
have not been following).

Speaking for just me myself and I, I have no problems with the
current overcommit model.  All I'd like to do is have a way to
indicate which processes should not get booted first, if the
system does indeed run out of swap and needs to boot some
processes.  However, other people seem much more worked up
about this topic than I am, and thus what I (personally) meant
as "just casual questions" seem to be taken as "demands that
something be done, RIGHT NOW".

I now realize that some people are arguing that malloc should
return an error if the system runs out of space, but that's not
what I am thinking about.

So, I think I'll bow out of this discussion for now, and maybe
try to discuss my "casual questions" sometime in a different
context...

---
Garance Alistair Drosehn   =   [EMAIL PROTECTED]
Senior Systems Programmer  or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Noriyuki Soda


 On Thu, 15 Jul 1999, Daniel C. Sobral wrote:
 Uh... like any modern unix, Solaris overcommits.

 On Thu, 15 Jul 1999 08:46:36 -0700 (PDT),
"Eduardo E. Horvath" [EMAIL PROTECTED] said:

 Where do you guys get this misinformation?  
:
 Note the `19464k reserved'; that space has been reserved but not yet
 allocated.

Both Dillon and Sobral mistakenly claimed that "Solaris overcommits";
that fact seems somewhat suggestive.

Also, the following are the allocated and reserved memory figures
in my environment. (This table also includes Eduardo's example.)

SunOS   allocated   reserved      total   total/allocated
-----   ---------   --------   --------   ---------------
4.1.4       4268k      1248k      5516k   1.2924
4.1.2       7732k      1492k      9224k   1.193
4.1.4       8848k      3080k     11928k   1.3481
4.1.4      13532k      6772k     20304k   1.5004
5.5.1      15312k      5092k     20404k   1.3325
4.1.3      16112k      6512k     22624k   1.4042
4.1.2      26356k      1620k     27976k   1.0615
4.1.4      26560k      3756k     30316k   1.1414
5.5        26076k     11348k     37424k   1.4352
4.1.4      32984k      5556k     38540k   1.1684
5.6        32448k      7072k     39520k   1.2179
4.1.4      38056k      3692k     41748k   1.097
4.1.4      49064k      7672k     56736k   1.1564
4.1.4      67012k      7800k     74812k   1.1164
4.1.4      99348k     16956k    116304k   1.1707
4.1.4     118288k     11780k    130068k   1.0996
5.6       231968k     18880k    250848k   1.0814
5.7       307240k     19464k    326704k   1.0634

(sorted by total amount of used swap)

In those examples, a non-overcommitting system requires 1.06x to 1.50x
the swap space of an overcommitting system.  This table also indicates
that the ratio decreases as the total used swap increases. And the extra
swap space required on a non-overcommitting system is approximately
several tens of megabytes, i.e. the extra cost of a non-overcommitting
system is less than ten dollars in my environment.

Matt Dillon claimed that a non-overcommitting system requires 8x or more
swap space than an overcommitting system. That's just wrong, as the table
shows. (There might be cases which require 8x swap, but that is not
typical, contrary to what Dillon said.)

If you don't want a non-overcommitting system because you don't want to
pay its cost, that's OK, but please don't force us to accept your
limited view.
--
soda



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


:Both Dillon and Sobral mistakenly claimed that "Solaris overcommits",
:this fact seems to be somewhat suggestive.
:
:And also, the followings are allocated memory and reserved memory 
:in my environment. (This table also includes Eduardo's example)
:
:   SunOS   allocated   reserved      total   total/allocated
:   -----   ---------   --------   --------   ---------------
:   4.1.4       4268k      1248k      5516k   1.2924
:   4.1.2       7732k      1492k      9224k   1.193
:   4.1.4       8848k      3080k     11928k   1.3481
:   4.1.4      13532k      6772k     20304k   1.5004
:   5.5.1      15312k      5092k     20404k   1.3325
:   4.1.3      16112k      6512k     22624k   1.4042
:   4.1.2      26356k      1620k     27976k   1.0615
:   4.1.4      26560k      3756k     30316k   1.1414
:   5.5        26076k     11348k     37424k   1.4352
:   4.1.4      32984k      5556k     38540k   1.1684
:   5.6        32448k      7072k     39520k   1.2179
:   4.1.4      38056k      3692k     41748k   1.097
:   4.1.4      49064k      7672k     56736k   1.1564
:   4.1.4      67012k      7800k     74812k   1.1164
:   4.1.4      99348k     16956k    116304k   1.1707
:   4.1.4     118288k     11780k    130068k   1.0996
:   5.6       231968k     18880k    250848k   1.0814
:   5.7       307240k     19464k    326704k   1.0634
:
:   (sorted by total amount of used swap)
:
:In those examples, non-overcommiting system requires 1.06x ... 1.50x
:...
:soda

Umm... how are you getting the reserved numbers?  Are you
sure that isn't simply cached swap blocks?  I.E. when something
gets swapped out and then is swapped back in and dirtied,
Solaris may be holding the swap block assignment rather
than letting it go.  FreeBSD-stable does the same thing.
FreeBSD-current does not -- it lets it go in order to be
able to reallocate it later as part of a contiguous swath
for performance reasons.

These 'extra' swap blocks are effectively reserved but not
actually allocated.  They can be reassigned.  The numbers
above are very similar to what you would see in a
redirtying-cache swap block situation on a FreeBSD-stable
system.

If I add up all the unshared writeable segments on my
home box - that is, all segments for which one would 
potentially have to reserve swap space - I get a total
of around 382MB.  The machine is currently eating around
100MB of ram and 5MB of swap, or around a 3.5:1 ratio
in this case.  A non-overcommit model would have to 
reserve swap space for 382MB - 100MB = 282MB versus the
5MB of swap the machine actually allocates.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon



:"pstat -s" on SunOS4, and "swap -s" on SunOS5. From Solaris man page:
:
::  -s   Print summary information about total swap
::       space usage and availability:
::
::       allocated   The total amount of swap space
::                   (in 1024-byte blocks) currently
::                   allocated for use as backing store.
::
::       reserved    The total amount of swap space
::                   (in 1024-bytes blocks) not currently
::                   allocated, but claimed by memory
::                   mappings for possible future use.
::
::       used        The total amount of swap space
::                   (in 1024-byte blocks) that is
::                   either allocated or reserved.
:--
:soda

Yah, that's what I thought.  A solaris expert could tell us
for sure but I am pretty sure those are simply cached swap
blocks after-the-fact, not actual reservations on potentially
swappable space.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


::  -s   Print summary information about total swap
::       space usage and availability:
::
::       allocated   The total amount of swap space
::                   (in 1024-byte blocks) currently
::                   allocated for use as backing store.
::
::       reserved    The total amount of swap space
::                   (in 1024-bytes blocks) not currently
::                   allocated, but claimed by memory
::                   mappings for possible future use.
::
::       used        The total amount of swap space
::                   (in 1024-byte blocks) that is
::                   either allocated or reserved.
:--
:soda

It would be really easy to test this.

Write a program that malloc's 32MB of space and touches it,
then sleeps 10 seconds and forks, with both child and parent
sleeping afterwards.  ( the parent and the forked child should
not touch the memory after the fork occurs ).

Do a pstat -s before, after the initial touch, and after
the fork.  If you do not see the reserved swap space jump
by 32MB after the fork, it isn't what you thought it was.
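
A rough sketch of such a test in C follows. It is one possible reading of
the description above, not a program anyone in the thread posted: the
function name touch_then_fork() and its parameters are invented here, and
the size and the pause are arguments so that the 32MB / 10 second case is
just touch_then_fork(32u << 20, 10) called from a trivial main().

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write one byte per page so every page of buf is really dirtied.
 * The volatile pointer keeps the compiler from dropping the stores. */
static void touch_pages(char *buf, size_t bytes)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t step = page > 0 ? (size_t)page : 4096;
    volatile char *p = buf;

    for (size_t off = 0; off < bytes; off += step)
        p[off] = 1;
}

/* Hypothetical test driver: allocate and touch bytes of memory, pause,
 * fork, then let parent and child pause again without touching the
 * memory. Run the swap summary command during each pause.
 * Returns 0 if everything ran, -1 on any failure. */
int touch_then_fork(size_t bytes, unsigned pause_secs)
{
    assert(bytes > 0);

    char *buf = malloc(bytes);
    if (buf == NULL)
        return -1;
    touch_pages(buf, bytes);
    sleep(pause_secs);              /* sample: after the initial touch */

    pid_t pid = fork();
    if (pid == -1) {
        free(buf);
        return -1;
    }
    if (pid == 0) {                 /* child: do not touch buf */
        sleep(pause_secs);
        _exit(0);
    }

    sleep(pause_secs);              /* sample: after the fork */
    int status = 0;
    pid_t done = waitpid(pid, &status, 0);
    free(buf);
    if (done == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The "before" sample is simply taken before starting the program.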

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Andrzej Bialecki


On Wed, 14 Jul 1999, John Nemeth wrote:

 On Jul 15,  2:40am, "Daniel C. Sobral" wrote:
 } Garance A Drosihn wrote:
 }  At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
 }   In which case the program that consumed all memory will be killed.
 }   The program killed is +NOT+ the one demanding memory, it's the one
 }   with most of it.
 }  
 }  But that isn't always the best process to have killed off...
 } 
 } Sure it is. :-) Let's see...
 
  This statement is absurd.  Only a competent admin can decide
 which process can be killed.  No arbitrary decision is going to be
 correct.
 
 }  It would be nice to have a way to indicate that, a la SIGDANGER.

How about assigning something like a class to process, which gives VM
 a hint which processes should be killed first without much thinking, and
which the last (or never)? In other words, let's say class 10 means
"totally disposable, kill whenever you want", and class 1 means "never try
to kill me". Of course, most processes would get some default value, and
superuser could "renice" them to more resistant class.
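
To make the idea concrete, here is a small sketch in C of how a victim
might be chosen under such a scheme. Everything in it is hypothetical: the
struct, the field names, and the tie-breaking rule (largest resident size
within the same class) are assumptions made for the example, not an
existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process record for the sketch. */
struct proc_info {
    int    pid;
    int    kill_class;    /* 10 = totally disposable ... 1 = never kill */
    size_t resident_kb;   /* used only to break ties within a class */
};

/* Return the index of the process to kill first, or -1 if every
 * process is in the protected class. */
int pick_victim(const struct proc_info *procs, size_t n)
{
    assert(procs != NULL || n == 0);

    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (procs[i].kill_class <= 1)
            continue;                       /* class 1: never a victim */
        if (best == -1 ||
            procs[i].kill_class > procs[best].kill_class ||
            (procs[i].kill_class == procs[best].kill_class &&
             procs[i].resident_kb > procs[best].resident_kb))
            best = (int)i;
    }
    return best;
}
```

With this rule a huge class 1 process is never selected, however much
memory it holds.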

This way both sides of the discussion would be satisfied :-)

Andrzej Bialecki

//  [EMAIL PROTECTED] WebGiro AB, Sweden (http://www.webgiro.com)
// ---
// -- FreeBSD: The Power to Serve. http://www.freebsd.org 
// --- Small  Embedded FreeBSD: http://www.freebsd.org/~picobsd/ 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon



:Before program start:
:total: 20000k bytes allocated + 4792k reserved = 24792k used, 191048k available
:
:After malloc, before touch:
:total: 18756k bytes allocated + 37500k reserved = 56256k used, 159580k available
:
:After malloc + touch:
:total: 52804k bytes allocated + 4852k reserved = 57656k used, 158184k available
:
:After fork:
:total: 52928k bytes allocated + 37644k reserved = 90572k used, 125264k available
:
:[there has been a little background activity, but the numbers speak for themselves]
:
:
:Daniel

Assuming the allocated field is not inclusive of real
memory, what we have is swap reservation under solaris
for clean pages, and allocation and assignment for dirty
pages.  The grand total will tell you the total VM potential
for malloc'd space but does not appear to tell you how 
much swap is actually active - i.e. was written to and 
contains valid data.

It would be interesting to see if the stack segment is
included in the reservation.  Try setting the stack resource
limit to 32m and run the same program, except without
bothering to malloc() or touch anything.  See if the
stack segment is included in the reservation field.

It would also be interesting to see how solaris deals
with MAP_PRIVATE mmap's.

If this is correct, then solaris is using a VMSPACE = SWAPSPACE
model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread sthaug


 If this is correct, then solaris is using a VMSPACE = SWAPSPACE
 model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

AFAIK it has been stated quite explicitly by the Solaris folks that
Solaris 2.x uses VMSPACE = SWAPSPACE + REALMEM. This is *different*
from SunOS 4.1.x.

Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


Here is what I get from one of BEST's mail and www proxy machines.
~dillon/br adds the object sizes together.  'swap' and 'default'
objects refer to unbacked VM objects - and none of the processes running
fork shared unbacked objects so we don't have to worry about that.  The 
'swap' designation means that at least one page in the object has been
assigned swap.  The default designation means that no pages have been 
assigned swap.  The pages can be dirty or clean.

Typical /proc/PID/map output looks like this (taken from one of the
sendmail processes).  The lines I've marked are the ones being counted
as unbacked/swap-backed VM.  The rest are vnode-backed and not counted.

0x1000     0x4b000     66   0  r-x COW vnode
0x4b000    0x4e000      3   3  rwx COW vnode
0x4e000    0x87000     53  43  rwx COW swap      <---
0x87000    0x373000   738 738  rwx default       <---
0x2004b000 0x2005a000   2   0  r-x COW vnode
0x2005a000 0x2005c000   2   0  rwx COW vnode
0x2005c000 0x20065000   6   2  rwx COW swap      <---
0x20068000 0x2006d000   3   0  r-x COW vnode
0x2006d000 0x2006e000   1   1  rwx COW vnode
0x2006e000 0x200cc000  70   0  r-x COW vnode
0x200cc000 0x200d0000   4   4  rwx COW vnode
0x200d0000 0x200e7000   8   6  rwx COW swap      <---
0xefbde000 0xefbfe000  14  14  rwx COW swap      <---

proxy1:/tmp# cat /proc/*/map | egrep 'swap|default' | ~dillon/br
639168K

proxy1:/tmp# pstat -s
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0b      524288    12596   511628     2%    Interleaved

This machine has 256MB of ram of which around 200MB is in use, we
will assume the entire 200MB is used by VM spaces for processes.  It is 
an active machine with around 205 processes at the time of the test.

So.  200MB of ram + 12MB of swap = 212MB of actual storage being used
out of 639MB of total swap-backable VM.

About a factor of 3.2:1.  Actual swap utilization is sitting at 2%.
If no overcommit were allowed, and assuming a VMSPACE = REALMEM + SWAP
model, 200MB of ram would be active and 439MB worth of swap would be 
either allocated or reserved ( though only 12MB would be actually written,
that part doesn't change ).  439MB of swap versus 12MB of swap.

In that scenario, the 512MB of swap I assigned to this machine would be
dangerously low.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread lyndon


 In that scenario, the 512MB of swap I assigned to this machine would be
 dangerously low.

With 13GB disks available for a couple of hundred bucks, my machines aren't
going to run out of swap space any time soon, even if I commit to disk.

All I want for Christmas is a knob to disable overcommit.

--lyndon



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Sheldon Hearn




On Thu, 15 Jul 1999 17:53:52 CST, [EMAIL PROTECTED] wrote:

 All I want for Christmas is a knob to disable overcommit.

And what I'm pretty sure the majority of the readers on this list want
is for those of you who really think it's necessary to do it yourselves.

What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
wonder whether that's significant...

Ciao,
Sheldon.



Re: Swap overcommit

1999-07-15 Thread Daniel C. Sobral


Andrew Reilly wrote:
 
 On Thu, Jul 15, 1999 at 11:48:41PM +0900, Daniel C. Sobral wrote:
  Actually, applications are written assuming that malloc() will not
  fail, generally speaking.
 
 Is this really the case?  I'm pretty sure I've _never_ ignored the
 possibility of a NULL return from malloc, and I've been using it
 for nearly 20 years.  I usually print a message and exit, but I
 never ignore it.  I thought that was pretty standard practise.

You are always free to inspect how applications deal with malloc(),
as far as open source software goes. Anyway, your "usual" behavior
is to expect malloc() will not fail. To print a message and exit is
to treat it as a fatal error, don't you agree?

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"



re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread matthew green


   
All I want for Christmas is a knob to disable overcommit.
   
   And what I'm pretty sure the majority of the readers on this list want
   is for those of you who really think it's necessary to do it yourselves.
   
   What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
   wonder whether that's significant...


that's an impressively bold statement to make.  by my reckoning, at
least 4 people who have posted "wanting no overcommit" are more than
capable of programming this for NetBSD.


.mrg.



Re: Swap overcommit

1999-07-15 Thread Matthew Dillon


: fail, generally speaking.
:
:Is this really the case?  I'm pretty sure I've _never_ ignored the
:possibility of a NULL return from malloc, and I've been using it
:for nearly 20 years.  I usually print a message and exit, but I
:never ignore it.  I thought that was pretty standard practise.
:
:This is just a random comment, orthogonal to the overcommit issue,
:but I've seen both you and Matthew say this now, and I was surprised
:both times.
:
:-- 
:Andrew

There are dozens of libc routines which call malloc internally and return 
allocated storage.  strdup(), opendir(), fopen(), setvbuf(), asprintf(),
and so forth.  Dozens.  And while we might check some of these for NULL, 
we don't check them all, and for the ones we do check we tend to assume
a failure other than a memory failure.  We would assume that the directory
or file does not exist, for example.  How many programmers check errno 
after such a failure?  Very few.  How many programmers bother to even
*clear* errno before making these calls (since some system calls do not
set errno if it is already non-zero).  Virtually nobody.

Having malloc() return NULL due to some *unrelated* process running the
system out of swap is unacceptable as it would result in serious 
instability to a great many programs that were never designed to deal
with the case.  Rather than crying about the system killing your favorite
process, you would be crying about half a dozen processes that are still
running no longer being stable.  As a sysop, I would reboot a system
in such a state instantly.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon



: In that scenario, the 512MB of swap I assigned to this machine would be
: dangerously low.
:
:With 13GB disks available for a couple of hundred bucks, my machines aren't
:going to run out of swap space any time soon, even if I commit to disk.
:
:All I want for Christmas is a knob to disable overcommit.
:
:--lyndon

If your machines aren't going to run out of swap, then the overcommit 
isn't going to hurt you in a million years.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit

1999-07-15 Thread Brian F. Feldman


On Thu, 15 Jul 1999, Matthew Dillon wrote:

 
 The are dozens of libc routines which call malloc internally and return 
 allocated storage.  strdup(), opendir(), fopen(), setvbuf(), asprintf(),
 and so forth.  Dozens.  And while we might check some of these for NULL, 
 we don't check them all, and the ones we do check we tend to conclude
 a failure other then a memory failure.  We would assume that the directory
 or file does not exist, for example.  How many programmers check errno 
 after such a failure?  Very few.  How many programmers bother to even
 *clear* errno before making these calls (since some system calls do not
   ^^
We're not supposed to have to clear errno unless we have to explicitly
test if it has changed. We're not supposed to clear it before any system
call which could possibly fail and set errno.

 set errno if it already non-zero).  Virtually nobody.
  
Erm... WTF?!?! If so, why the HELL are we doing that?!?

 
   -Matt
   Matthew Dillon 
   [EMAIL PROTECTED]
 
 
 

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 [EMAIL PROTECTED]   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 




Re: Swap overcommit

1999-07-15 Thread Matthew Dillon


: The are dozens of libc routines which call malloc internally and return 
: allocated storage.  strdup(), opendir(), fopen(), setvbuf(), asprintf(),
: and so forth.  Dozens.  And while we might check some of these for NULL, 
: we don't check them all, and the ones we do check we tend to conclude
: a failure other then a memory failure.  We would assume that the directory
: or file does not exist, for example.  How many programmers check errno 
: after such a failure?  Very few.  How many programmers bother to even
: *clear* errno before making these calls (since some system calls do not
:  ^^
:We're not supposed to have to clear errno unless we have to explicitly
:test if it has changed. We're not supposed to clear it before any system
:call which could possibly fail and set errno.
:
: set errno if it already non-zero).  Virtually nobody.
:  
:Erm... WTF?!?! If so, why the HELL are we doing that?!?

No, wait, I got that wrong I think.

Oh yah, I remember now.  Hmm.  How odd.  I came across a case where
read() could return -1 and not set errno properly if errno
was already set, but a perusal of the kernel code seems to indicate
that this can't happen.  Very weird.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon



:Technical follow-up:
:
:Contrary to what I previously said, a number of tests reveal that
:Solaris, indeed, does not overcommit. All non-read only segments,
:and all malloc()ed memory is reserved upon exec() or fork(), and the
:reserved memory is not allowed to exceed the total memory. It makes
:extensive use of read only DATA segments, and has a NON_RESERVE
:mmap() flag.
:
:Though the foot firmly planted in my mouth ought to prevent me from
:saying anything else, I must say that it does explain a few things
:to me...
:
:--
:Daniel C. Sobral   (8-DCS)
:[EMAIL PROTECTED]

Something is weird here.  If the solaris people are using a 
SWAPSIZE + REALMEM VM model, they have to allow the 
allocated + reserved space go +REALMEM bytes over available swap 
space.  If not they are using only a SWAPSIZE VM model.

Wait - does Solaris normally use swap files or swap partitions?
Or is it that weird /tmp filesystem stuff?  If it normally uses swap 
files and allows holes then that explains everything.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit

1999-07-15 Thread David E. Cross


  No, wait, I got that wrong I think.
  
  Oh yah, I remember now.  Hmm.  How odd.  I came across a case where
  read() could return -1 and not set errno properly if errno
  was already set, but a perusal of the kernel code seems to indicate
  that this can't happen.  Very weird.
  
 
 I thought I saw this somewhere too, but I thought it was more of a case that
 it was somewhere *inside* read that errno had to be preserved. i.e. errno
 gets set somewhere at the top of the code, and if it was already set at a
 certain point, failure was expected, and to pass along the original errno,
 not the new one.
 
 Or perhaps we're sharing a hallucination. :)
Well, set/getpriority(2), certainly can return "-1"  and not be an error.
You would need to clear out errno before that call and check it on return.
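
The getpriority(2) case can be sketched in C as follows. This is only an illustration and not code from the thread: the helper name read_own_priority is made up here, and it assumes a POSIX system where getpriority() is declared in <sys/resource.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Read the nice value of the calling process.
 * Returns 0 and stores the value in *prio on success,
 * or -1 with errno set when getpriority() really failed.
 * (read_own_priority is a hypothetical helper name.)
 */
int read_own_priority(int *prio)
{
    assert(prio != NULL);

    errno = 0;                        /* -1 is a legal priority, so clear first */
    int value = getpriority(PRIO_PROCESS, 0);
    if (value == -1 && errno != 0)
        return -1;                    /* genuine error: errno was set by the call */

    *prio = value;                    /* may legitimately be -1 */
    return 0;
}
```

A caller would test the helper's return value and never compare the priority itself against -1.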

This is where exceptions would be a great gain.  It could also be used to
force programmers to check their system calls more closely.  Oops, you didn't
handle exception foo?  SIGBADPROGRAMMER.
--
David Cross   | email: [EMAIL PROTECTED] 
Systems Administrator/Research Programmer | Web: http://www.cs.rpi.edu/~crossd 
Rensselaer Polytechnic Institute, | Ph: 518.276.2860
Department of Computer Science| Fax: 518.276.4033
I speak only for myself.  | WinNT:Linux::Linux:FreeBSD



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Michael Schuster - TSC SunOS Germany

Hi everyone,

I've been following this discussion almost from the beginning, and I
have the feeling that we're not _really_ getting very far. There are good
arguments for and against overcommit, depending on your point of view
and your requirements.

What I do see is a not-so-openly voiced consensus that the way
resource shortages are handled in an overcommitting system
(SIGKILL) makes some of us rather unhappy. I therefore suggest those of
us who would like to see a change in this area pool their efforts and
energies to work on a mechanism that handles resource shortage in a more
graceful way.

cheerio
Michael
-- 
michael.schus...@germany.sun.com



Re: Swap overcommit

1999-07-15 Thread Daniel C. Sobral

Danny Thomas wrote:
 
 Killing the biggest is simple to implement and usually right.
 ... but some people don't want that policy, at least on some of their
 systems. Does FreeBSD offer alternatives? If so, they've been conspicuously
 absent from discussion, which might have taken things into a more
 productive vein. What do other over-committing systems offer?

Absent, eh? FreeBSD offers soft and hard limits on resources.

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

Would you like to go out with me?
I'd love to.
Oh, well, n... err... would you?... ahh... huh... what do I do
next?





Re: Swap overcommit

1999-07-15 Thread Daniel C. Sobral

lyn...@orthanc.ab.ca wrote:
 
 All of the arguments I've seen so far assume that one process is
 running off and grabbing all the available memory. That may be
 the most likely scenario, but it's most certainly not the *only*
 scenario. What if you have a whole bunch of middle sized processes
 running, all using memory efficiently, but in total using 95% of
 the available VM. A malloc(5*1024*1024) might work, but I need
 10 MB instead of 5MB. And my memory footprint is just a little
 bit bigger than the other guys. Instead of returning NULL to
 the malloc() request, *zap* I'm dead. How can you possibly
 call that sensible behaviour?

No process is killed at malloc() time. A process is killed when
(another) process needs more memory and it is not available.

 Yes, the machine is under-resourced. I can't help that -- it's not my
 machine. The machine belongs to a customer who happens to run my IMAP
 software, who also happens to have ignored our sizing guidelines. In
 this situation I have no choice but to deal with the low memory
 condition, and our code does that, if it's given the chance! At
 least give me the opportunity to deal with the situation gracefully.

If it were not for overcommit, you wouldn't be running half of what
you are running on that machine in the first place. So, overcommit is
helping you run much more with the same resources.

 What if we decided to defer errors from bind just because there
 weren't any mbufs available, and later killed the process when it
 tried to do network I/O? People would howl bloody murder! (== this is
 rhetorical, folks)

Out of mbufs does not result in system deadlock, out of memory does.

 The semantics of malloc() have been defined since almost the dawn of
 time. From the current manpage:
 
   RETURN VALUES
  The malloc() and calloc() functions return a pointer to the allocated
  memory if successful; otherwise a NULL pointer is returned.
 
 Nowhere does it say that allocated memory might not exist. Nowhere
 does it say that I have to touch all the allocated pages to make
 sure they are really there. Nowhere does it say process death at
 some non-deterministic time in the future might be a side effect
 of calling malloc().

And nowhere does it say it does not, of course. But that is beside
the point. malloc() works as specified. It is the behavior of the
system in a low-resource situation that leads to processes being
killed.

 Applications are written assuming that malloc() behaves in the
 documented manner. It is *not* acceptable to tell applications writers

Actually, applications are written assuming that malloc() will not
fail, generally speaking.

 that they have to provide their own management routines on top of malloc()
 (SEGV catchers and the like) if they want the long standing semantics
 of malloc() to be preserved. If the current malloc() cannot behave in
 the documented and expected manner it needs to be renamed, because
 malloc() it most certainly isn't.

It's funny how all these FreeBSD systems manage to gain such a good
reputation despite such an obvious flaw, isn't it? :-)

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

Would you like to go out with me?
I'd love to.
Oh, well, n... err... would you?... ahh... huh... what do I do
next?





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Garance A Drosihn

At 6:29 PM -0700 7/14/99, Matthew Dillon wrote:
If 1G isn't enough, spend another $30 and throw 2G of swap
online.  Or perhaps dedicate an entire $150 disk and throw
6+ GB of swap online.

The equivalent setup using a non-overcommit model would require
considerably more swap to have the same reliability.

Please note that we're talking at cross-purposes here, mainly
because I didn't realize this same general topic was being
beaten to death in the 'replacement for grep' thread (which I
have not been following).

Speaking for just me myself and I, I have no problems with the
current overcommit model.  All I'd like to do is have a way to
indicate which processes should not get booted first, if the
system does indeed run out of swap and needs to boot some
processes.  However, other people seem much more worked up
about this topic than I am, and thus what I (personally) meant
as just casual questions seem to be taken as demands that
something be done, RIGHT NOW.

I now realize that some people are arguing that malloc should
return an error if the system runs out of space, but that's not
what I am thinking about.

So, I think I'll bow out of this discussion for now, and maybe
try to discuss my casual questions sometime in a different
context...

---
Garance Alistair Drosehn   =   g...@eclipse.acs.rpi.edu
Senior Systems Programmer  or  dro...@rpi.edu
Rensselaer Polytechnic Institute



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Noriyuki Soda

 On Thu, 15 Jul 1999, Daniel C. Sobral wrote:
 Uh... like any modern unix, Solaris overcommits.

 On Thu, 15 Jul 1999 08:46:36 -0700 (PDT),
Eduardo E. Horvath e...@one-o.com said:

 Where do you guys get this misinformation?  
:
 Note the `19464k reserved'; that space has been reserved but not yet
 allocated.

Both Dillon and Sobral mistakenly claimed that Solaris overcommits,
this fact seems to be somewhat suggestive.

Also, the following are the allocated and reserved memory figures
in my environment. (This table also includes Eduardo's example.)

SunOS   allocated   reserved     total   total/allocated
-----   ---------   --------   -------   ---------------
4.1.4       4268k      1248k     5516k   1.2924
4.1.2       7732k      1492k     9224k   1.193
4.1.4       8848k      3080k    11928k   1.3481
4.1.4      13532k      6772k    20304k   1.5004
5.5.1      15312k      5092k    20404k   1.3325
4.1.3      16112k      6512k    22624k   1.4042
4.1.2      26356k      1620k    27976k   1.0615
4.1.4      26560k      3756k    30316k   1.1414
5.5        26076k     11348k    37424k   1.4352
4.1.4      32984k      5556k    38540k   1.1684
5.6        32448k      7072k    39520k   1.2179
4.1.4      38056k      3692k    41748k   1.097
4.1.4      49064k      7672k    56736k   1.1564
4.1.4      67012k      7800k    74812k   1.1164
4.1.4      99348k     16956k   116304k   1.1707
4.1.4     118288k     11780k   130068k   1.0996
5.6       231968k     18880k   250848k   1.0814
5.7       307240k     19464k   326704k   1.0634

(sorted by total amount of used swap)

In those examples, a non-overcommitting system requires 1.06x to 1.50x
as much swap space as an overcommitting system.  The table also indicates
that the ratio decreases as the total used swap increases, and that the
extra swap space required on a non-overcommitting system is approximately
several tens of megabytes, i.e. the extra cost of a non-overcommitting
system is less than ten dollars in my environment.

Matt Dillon claimed that a non-overcommitting system requires 8x or more
swap space than an overcommitting system. The figures above show that is wrong.
(There might be cases which require 8x swap, but they are not typical,
 contrary to what Dillon said.)

If you don't want a non-overcommitting system because you don't want to
pay its cost, that's OK, but please don't force us to accept your
limited view.
--
soda



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

:Both Dillon and Sobral mistakenly claimed that Solaris overcommits,
:this fact seems to be somewhat suggestive.
:
:Also, the following are the allocated and reserved memory figures
:in my environment. (This table also includes Eduardo's example.)
:
:   SunOS   allocated   reserved     total   total/allocated
:   -----   ---------   --------   -------   ---------------
:   4.1.4       4268k      1248k     5516k   1.2924
:   4.1.2       7732k      1492k     9224k   1.193
:   4.1.4       8848k      3080k    11928k   1.3481
:   4.1.4      13532k      6772k    20304k   1.5004
:   5.5.1      15312k      5092k    20404k   1.3325
:   4.1.3      16112k      6512k    22624k   1.4042
:   4.1.2      26356k      1620k    27976k   1.0615
:   4.1.4      26560k      3756k    30316k   1.1414
:   5.5        26076k     11348k    37424k   1.4352
:   4.1.4      32984k      5556k    38540k   1.1684
:   5.6        32448k      7072k    39520k   1.2179
:   4.1.4      38056k      3692k    41748k   1.097
:   4.1.4      49064k      7672k    56736k   1.1564
:   4.1.4      67012k      7800k    74812k   1.1164
:   4.1.4      99348k     16956k   116304k   1.1707
:   4.1.4     118288k     11780k   130068k   1.0996
:   5.6       231968k     18880k   250848k   1.0814
:   5.7       307240k     19464k   326704k   1.0634
:
:   (sorted by total amount of used swap)
:
:In those examples, a non-overcommitting system requires 1.06x ... 1.50x
:...
:soda

Umm... how are you getting the reserved numbers?  Are you
sure that isn't simply cached swap blocks?  I.E. when something
gets swapped out and then is swapped back in and dirtied,
Solaris may be holding the swap block assignment rather
than letting it go.  FreeBSD-stable does the same thing.
FreeBSD-current does not -- it lets it go in order to be
able to reallocate it later as part of a contiguous swath
for performance reasons.

These 'extra' swap blocks are effectively reserved but not
actually allocated.  They can be reassigned.  The numbers
above are very similar to what you would see in a
redirtying-cache swap block situation on a FreeBSD-stable
system.

If I add up all the unshared writeable segments on my
home box - that is, all segments for which one would 
potentially have to reserve swap space - I get a total
of around 382MB.  The machine is currently eating around
100MB of ram and 5MB of swap, or around a 3.5:1 ratio
in this case.  A non-overcommit model would have to 
reserve swap space for 382MB - 100MB = 282MB versus the
5MB of swap the machine actually allocates.

-Matt
Matthew Dillon 
dil...@backplane.com




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Noriyuki Soda

 On Thu, 15 Jul 1999 11:09:01 -0700 (PDT),
Matthew Dillon dil...@apollo.backplane.com said:

 Umm... how are you getting the reserved numbers? 

pstat -s on SunOS4, and swap -s on SunOS5. From Solaris man page:

:-s   Print summary information about total swap
:     space usage and availability:
:
:     allocated   The total amount of swap space
:                 (in 1024-byte blocks)
:                 currently allocated for use as
:                 backing store.
:
:     reserved    The total amount of swap space
:                 (in 1024-byte blocks) not
:                 currently allocated, but
:                 claimed by memory mappings for
:                 possible future use.
:
:     used        The total amount of swap space
:                 (in 1024-byte blocks) that is
:                 either allocated or reserved.
--
soda



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


:pstat -s on SunOS4, and swap -s on SunOS5. From Solaris man page:
:
::-s   Print summary information about total swap
::     space usage and availability:
::
::     allocated   The total amount of swap space
::                 (in 1024-byte blocks)
::                 currently allocated for use as
::                 backing store.
::
::     reserved    The total amount of swap space
::                 (in 1024-byte blocks) not
::                 currently allocated, but
::                 claimed by memory mappings for
::                 possible future use.
::
::     used        The total amount of swap space
::                 (in 1024-byte blocks) that is
::                 either allocated or reserved.
:--
:soda

Yah, that's what I thought.  A solaris expert could tell us
for sure but I am pretty sure those are simply cached swap
blocks after-the-fact, not actual reservations on potentially
swappable space.

-Matt
Matthew Dillon 
dil...@backplane.com



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

::-s   Print summary information about total swap
::     space usage and availability:
::
::     allocated   The total amount of swap space
::                 (in 1024-byte blocks)
::                 currently allocated for use as
::                 backing store.
::
::     reserved    The total amount of swap space
::                 (in 1024-byte blocks) not
::                 currently allocated, but
::                 claimed by memory mappings for
::                 possible future use.
::
::     used        The total amount of swap space
::                 (in 1024-byte blocks) that is
::                 either allocated or reserved.
:--
:soda

It would be really easy to test this.

Write a program that mallocs 32MB of space and touches it,
then sleeps 10 seconds and forks, with both child and parent
sleeping afterwards.  (The parent and the forked child should
not touch the memory after the fork occurs.)

Do a pstat -s before, after the initial touch, and after
the fork.  If you do not see the reserved swap space jump
by 32MB after the fork, it isn't what you thought it was.
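
The test described above could look roughly like this in C. It is a sketch under assumptions, not the program anyone in the thread actually ran: the name reservation_probe and its two parameters are invented here, and the block is touched one byte per page through a volatile pointer so that an optimizing compiler cannot drop the writes.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Allocate `bytes`, dirty every page, pause, fork, then let parent and
 * child idle without touching the block again.  Run `pstat -s` (SunOS 4)
 * or `swap -s` (Solaris) from another terminal during each pause.
 * Returns 0 on success, -1 on any failure.
 * (reservation_probe is a hypothetical name for this sketch.)
 */
int reservation_probe(size_t bytes, unsigned pause_secs)
{
    assert(bytes > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                    /* fallback guess if sysconf fails */

    char *block = malloc(bytes);
    if (block == NULL)
        return -1;

    volatile char *touch = block;       /* volatile: keep the stores */
    for (size_t off = 0; off < bytes; off += (size_t)page)
        touch[off] = 1;                 /* one write dirties the whole page */

    sleep(pause_secs);                  /* sample swap usage: after touch */

    pid_t pid = fork();
    if (pid == -1) {
        free(block);
        return -1;
    }
    if (pid == 0) {                     /* child: idle, never write to block */
        sleep(pause_secs);
        _exit(0);
    }

    sleep(pause_secs);                  /* sample swap usage: after fork */

    int status = 0;
    pid_t reaped = waitpid(pid, &status, 0);
    free(block);
    if (reaped == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling reservation_probe(32u * 1024 * 1024, 10) from main() would match the 32MB and 10 second figures suggested above.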

-Matt
Matthew Dillon 
dil...@backplane.com



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Andrzej Bialecki

On Wed, 14 Jul 1999, John Nemeth wrote:

 On Jul 15,  2:40am, Daniel C. Sobral wrote:
 } Garance A Drosihn wrote:
 }  At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
 }   In which case the program that consumed all memory will be killed.
 }   The program killed is +NOT+ the one demanding memory, it's the one
 }   with most of it.
 }  
 }  But that isn't always the best process to have killed off...
 } 
 } Sure it is. :-) Let's see...
 
  This statement is absurd.  Only a competent admin can decide
 which process can be killed.  No arbitrary decision is going to be
 correct.
 
 }  It would be nice to have a way to indicate that, a la SIGDANGER.

How about assigning something like a class to a process, which gives the VM
a hint which processes should be killed first without much thinking, and
which the last (or never)? In other words, let's say class 10 means
totally disposable, kill whenever you want, and class 1 means never try
to kill me. Of course, most processes would get some default value, and
superuser could renice them to more resistant class.
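
One way such a class hint might be consulted is sketched below. Everything in it is hypothetical: the struct, the constants, and the pick_victim name are invented for illustration, and breaking ties by memory footprint is an assumption borrowed from the "kill the biggest" policy discussed earlier in the thread, not part of this proposal.

```c
#include <assert.h>
#include <stddef.h>

#define KILL_CLASS_NEVER        1   /* "never try to kill me" */
#define KILL_CLASS_DISPOSABLE  10   /* "kill whenever you want" */

/* Hypothetical per-process record; not a real kernel structure. */
struct candidate {
    int    pid;
    int    kill_class;    /* 1..10, higher means more disposable */
    size_t resident_kb;   /* footprint, used only to break ties */
};

/*
 * Return the index of the process to kill, or -1 if every candidate
 * is in the protected class.  Prefers the highest class; among equal
 * classes, prefers the larger footprint (an assumption of this sketch).
 */
int pick_victim(const struct candidate *procs, size_t count)
{
    assert(procs != NULL || count == 0);

    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (procs[i].kill_class <= KILL_CLASS_NEVER)
            continue;                           /* protected, skip */
        if (best == -1 ||
            procs[i].kill_class > procs[best].kill_class ||
            (procs[i].kill_class == procs[best].kill_class &&
             procs[i].resident_kb > procs[best].resident_kb))
            best = (int)i;
    }
    return best;
}
```

With this shape, a large class 1 process is never chosen even when it holds most of the memory, which is the behaviour the protected class is meant to give.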

This way both sides of the discussion would be satisfied :-)

Andrzej Bialecki

//  ab...@webgiro.com WebGiro AB, Sweden (http://www.webgiro.com)
// ---
// -- FreeBSD: The Power to Serve. http://www.freebsd.org 
// --- Small  Embedded FreeBSD: http://www.freebsd.org/~picobsd/ 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


:Before program start:
:total: 2k bytes allocated + 4792k reserved = 24792k used, 191048k available
:
:After malloc, before touch:
:total: 18756k bytes allocated + 37500k reserved = 56256k used, 159580k available
:
:After malloc + touch:
:total: 52804k bytes allocated + 4852k reserved = 57656k used, 158184k available
:
:After fork:
:total: 52928k bytes allocated + 37644k reserved = 90572k used, 125264k available
:
:[there has been a little background activity, but the numbers speak for themselves]
:
:
:Daniel

Assuming the allocated field is not inclusive of real
memory, what we have is swap reservation under solaris
for clean pages, and allocation and assignment for dirty
pages.  The grand total will tell you the total VM potential
for malloc'd space but does not appear to tell you how 
much swap is actually active - i.e. was written to and 
contains valid data.

It would be interesting to see if the stack segment is
included in the reservation.  Try setting the stack resource
limit to 32m and run the same program, except without
bothering to malloc() or touch anything.  See if the
stack segment is included in the reservation field.

It would also be interesting to see how solaris deals
with MAP_PRIVATE mmap's.

If this is correct, then solaris is using a VMSPACE = SWAPSPACE
model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

-Matt
Matthew Dillon 
dil...@backplane.com




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Jonathan Lemon

In article local.mail.freebsd-hackers/199907151825.laa11...@apollo.backplane.com you write:
::-s   Print summary information about total swap
::     space usage and availability:
::
::     allocated   The total amount of swap space
::                 (in 1024-byte blocks)
::                 currently allocated for use as
::                 backing store.
::
::     reserved    The total amount of swap space
::                 (in 1024-byte blocks) not
::                 currently allocated, but
::                 claimed by memory mappings for
::                 possible future use.
::
::     used        The total amount of swap space
::                 (in 1024-byte blocks) that is
::                 either allocated or reserved.
:--
:soda

It would be really easy to test this.

Write a program that mallocs 32MB of space and touches it,
then sleeps 10 seconds and forks, with both child and parent
sleeping afterwards.  (The parent and the forked child should
not touch the memory after the fork occurs.)

Do a pstat -s before, after the initial touch, and after
the fork.  If you do not see the reserved swap space jump
by 32MB after the fork, it isn't what you thought it was.

aladdin[5:32pm] prtconf
System Configuration:  Sun Microsystems  i86pc
Memory size: 128 Megabytes

aladdin[5:41pm] uname -a
SunOS aladdin 5.6 Generic_105182-14 i86pc i386


total: 67280k bytes allocated + 28668k reserved = 95948k used, 196460k avail
malloced 32MB...
total: 67320k bytes allocated + 61460k reserved = 128780k used, 163592k avail
touched...
total: 100084k bytes allocated + 28696k reserved = 128780k used, 163732k avail
forking...
total: 100092k bytes allocated + 61520k reserved = 161612k used, 130864k avail
touching again (parent)...
touching again (child)...
total: 132864k bytes allocated + 28748k reserved = 161612k used, 130760k avail
exiting...
exiting...
total: 67248k bytes allocated + 28700k reserved = 95948k used, 196448k avail

--
Jonathan



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread sthaug

 If this is correct, then solaris is using a VMSPACE = SWAPSPACE
 model.  FreeBSD uses a VMSPACE = SWAPSPACE + REALMEM model.

AFAIK it has been stated quite explicitly by the Solaris folks that
Solaris 2.x uses VMSPACE = SWAPSPACE + REALMEM. This is *different*
from SunOS 4.1.x.

Steinar Haug, Nethelp consulting, sth...@nethelp.no



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon

Here is what I get from one of BEST's mail and www proxy machines.
~dillon/br adds the object sizes together.  'swap' and 'default'
objects refer to unbacked VM objects - and none of the processes running
fork shared unbacked objects so we don't have to worry about that.  The 
'swap' designation means that at least one page in the object has been
assigned swap.  The default designation means that no pages have been 
assigned swap.  The pages can be dirty or clean.

Typical /proc/PID/map output looks like this (taken from one of the
sendmail processes).  The lines I've marked are the ones being counted
as unbacked/swap-backed VM.  The rest are vnode-backed and not counted.

0x1000     0x4b000     66   0   r-x COW vnode
0x4b000    0x4e000      3   3   rwx COW vnode
0x4e000    0x87000     53  43   rwx COW swap      ---
0x87000    0x373000   738 738   rwx     default   ---
0x2004b000 0x2005a000   2   0   r-x COW vnode
0x2005a000 0x2005c000   2   0   rwx COW vnode
0x2005c000 0x20065000   6   2   rwx COW swap      ---
0x20068000 0x2006d000   3   0   r-x COW vnode
0x2006d000 0x2006e000   1   1   rwx COW vnode
0x2006e000 0x200cc000  70   0   r-x COW vnode
0x200cc000 0x200d       4   4   rwx COW vnode
0x200d     0x200e7000   8   6   rwx COW swap      ---
0xefbde000 0xefbfe000  14  14   rwx COW swap      ---

proxy1:/tmp# cat /proc/*/map | egrep 'swap|default' | ~dillon/br
639168K
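
The thread does not show what ~dillon/br contains. A guess at an equivalent, assuming it simply sums end minus start for each map line it is fed, might look like this in C (the name sum_map_bytes is hypothetical):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Sum (end - start) over map lines read from `in`, for example lines
 * already filtered by egrep 'swap|default'.  Lines that do not start
 * with two hexadecimal addresses are skipped.  Returns a byte count.
 * This is a guess at what a "br"-style adder does, not the original.
 */
unsigned long long sum_map_bytes(FILE *in)
{
    assert(in != NULL);

    char line[512];
    unsigned long long total = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        unsigned long long start, end;
        /* %llx accepts an optional 0x prefix, as in the map output */
        if (sscanf(line, "%llx %llx", &start, &end) == 2 && end > start)
            total += end - start;
    }
    return total;
}
```

Fed from stdin and printed as the total divided by 1024 with a K suffix, a program built around this function would fit the same pipeline as above.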

proxy1:/tmp# pstat -s
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0b      524288    12596   511628     2%    Interleaved

This machine has 256MB of ram of which around 200MB is in use, we
will assume the entire 200MB is used by VM spaces for processes.  It is 
an active machine with around 205 processes at the time of the test.

So.  200MB of ram + 12MB of swap = 212MB of actual storage being used
out of 639MB of total swap-backable VM.

About a factor of 3.2:1.  Actual swap utilization is sitting at 2%.
If no overcommit were allowed, and assuming a VMSPACE = REALMEM + SWAP
model, 200MB of ram would be active and 439MB worth of swap would be 
either allocated or reserved ( though only 12MB would be actually written,
that part doesn't change ).  439MB of swap versus 12MB of swap.

In that scenario, the 512MB of swap I assigned to this machine would be
dangerously low.
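
The arithmetic behind that estimate, written out as a small C helper. The function name is made up for this sketch, and it encodes only the assumption stated above: under a VMSPACE = REALMEM + SWAP model, a system that refuses to overcommit has to set aside swap for all swap-backable VM that does not fit in the RAM already in use.

```c
#include <assert.h>

/*
 * MB of swap that must be allocated or reserved without overcommit,
 * given total swap-backable VM and the RAM in use by process VM.
 * (reserved_swap_mb is a hypothetical helper, not from the thread.)
 */
unsigned long reserved_swap_mb(unsigned long backable_vm_mb,
                               unsigned long ram_in_use_mb)
{
    assert(ram_in_use_mb > 0);      /* a running machine uses some RAM */

    if (backable_vm_mb <= ram_in_use_mb)
        return 0;                   /* everything fits in RAM */
    return backable_vm_mb - ram_in_use_mb;
}
```

For the proxy machine this gives reserved_swap_mb(639, 200) = 439, against the 12MB of swap actually written and the 512MB configured.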

-Matt
Matthew Dillon 
dil...@backplane.com




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread lyndon

 In that scenario, the 512MB of swap I assigned to this machine would be
 dangerously low.

With 13GB disks available for a couple of hundred bucks, my machines aren't
going to run out of swap space any time soon, even if I commit to disk.

All I want for Christmas is a knob to disable overcommit.

--lyndon



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Sheldon Hearn



On Thu, 15 Jul 1999 17:53:52 CST, lyn...@orthanc.ab.ca wrote:

 All I want for Christmas is a knob to disable overcommit.

And what I'm pretty sure the majority of the readers on this list want
is for those of you who really think it's necessary to do it yourselves.

What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
wonder whether that's significant...

Ciao,
Sheldon.



Re: Swap overcommit

1999-07-15 Thread Andrew Reilly

On Thu, Jul 15, 1999 at 11:48:41PM +0900, Daniel C. Sobral wrote:
 Actually, applications are written assuming that malloc() will not
 fail, generally speaking.

Is this really the case?  I'm pretty sure I've _never_ ignored the
possibility of a NULL return from malloc, and I've been using it
for nearly 20 years.  I usually print a message and exit, but I
never ignore it.  I thought that was pretty standard practise.

This is just a random comment, orthogonal to the overcommit issue,
but I've seen both you and Matthew say this now, and I was surprised
both times.

-- 
Andrew



Re: Swap overcommit

1999-07-15 Thread Daniel C. Sobral

Andrew Reilly wrote:
 
 On Thu, Jul 15, 1999 at 11:48:41PM +0900, Daniel C. Sobral wrote:
  Actually, applications are written assuming that malloc() will not
  fail, generally speaking.
 
 Is this really the case?  I'm pretty sure I've _never_ ignored the
 possibility of a NULL return from malloc, and I've been using it
 for nearly 20 years.  I usually print a message and exit, but I
 never ignore it.  I thought that was pretty standard practise.

You are always free to inspect how applications deal with malloc(),
as far as open source software goes. Anyway, your usual behavior
is to expect malloc() will not fail. To print a message and exit is
to treat it as a fatal error, don't you agree?

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

Would you like to go out with me?
I'd love to.
Oh, well, n... err... would you?... ahh... huh... what do I do
next?



re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread matthew green

   
All I want for Christmas is a knob to disable overcommit.
   
   And what I'm pretty sure the majority of the readers on this list want
   is for those of you who really think it's necessary to do it yourselves.
   
   What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
   wonder whether that's significant...


that's an impressively bold statement to make.  by my reckoning, at
least 4 people who have posted wanting no overcommit are more than
capable of programming this for NetBSD.


.mrg.



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread lyndon

 And what I'm pretty sure the majority of the readers on this list want
 is for those of you who really think it's necessary to do it yourselves.
 
 What? Nobody who wants to disable the policy knows how to do it? Hmmm, I
 wonder whether that's significant...

Sheldon, if you can't contribute something useful, then shut up.

If I have to do it myself, I will.




Re: Swap overcommit

1999-07-15 Thread Matthew Dillon

: fail, generally speaking.
:
:Is this really the case?  I'm pretty sure I've _never_ ignored the
:possibility of a NULL return from malloc, and I've been using it
:for nearly 20 years.  I usually print a message and exit, but I
:never ignore it.  I thought that was pretty standard practise.
:
:This is just a random comment, orthogonal to the overcommit issue,
:but I've seen both you and Matthew say this now, and I was surprised
:both times.
:
:-- 
:Andrew

There are dozens of libc routines which call malloc internally and return 
allocated storage.  strdup(), opendir(), fopen(), setvbuf(), asprintf(),
and so forth.  Dozens.  And while we might check some of these for NULL, 
we don't check them all, and for the ones we do check we tend to assume
a failure other than a memory failure.  We would assume that the directory
or file does not exist, for example.  How many programmers check errno 
after such a failure?  Very few.  How many programmers bother to even
*clear* errno before making these calls (since some system calls do not
set errno if it is already non-zero)?  Virtually nobody.
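
For illustration, here is what checking errno after one of these calls might look like. This is a sketch rather than a recommendation from the thread: the enum and the open_checked name are invented, and errno is read only after fopen() has reported failure, which is the one point at which its value is meaningful.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum open_result { OPEN_OK, OPEN_MISSING, OPEN_NO_MEMORY, OPEN_OTHER };

/*
 * Open `path` for reading and classify a failure by errno instead of
 * assuming that NULL means "file does not exist".
 */
enum open_result open_checked(const char *path, FILE **out)
{
    assert(path != NULL && out != NULL);

    *out = fopen(path, "r");
    if (*out != NULL)
        return OPEN_OK;

    switch (errno) {                /* valid here: fopen() just failed */
    case ENOENT:
        return OPEN_MISSING;        /* the usual assumption */
    case ENOMEM:
        return OPEN_NO_MEMORY;      /* the internal allocation failed */
    default:
        return OPEN_OTHER;          /* EACCES, EMFILE, and so on */
    }
}
```

A caller that treats OPEN_NO_MEMORY separately can back off or shed load rather than report a missing file.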

Having malloc() return NULL due to some *unrelated* process running the
system out of swap is unacceptable as it would result in serious 
instability in a great many programs that were never designed to deal
with the case.  Rather than crying about the system killing your favorite
process, you would be crying about half a dozen processes that are still
running but no longer stable.  As a sysop, I would reboot a system
in such a state instantly.

-Matt
Matthew Dillon 
dil...@backplane.com



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


: In that scenario, the 512MB of swap I assigned to this machine would be
: dangerously low.
:
:With 13GB disks available for a couple of hundred bucks, my machines aren't
:going to run out of swap space any time soon, even if I commit to disk.
:
:All I want for Christmas is a knob to disable overcommit.
:
:--lyndon

If your machines aren't going to run out of swap, then the overcommit 
isn't going to hurt you in a million years.

-Matt
Matthew Dillon 
dil...@backplane.com



Re: Swap overcommit

1999-07-15 Thread Brian F. Feldman

On Thu, 15 Jul 1999, Matthew Dillon wrote:

 
 There are dozens of libc routines which call malloc internally and return 
 allocated storage.  strdup(), opendir(), fopen(), setvbuf(), asprintf(),
 and so forth.  Dozens.  And while we might check some of these for NULL, 
 we don't check them all, and the ones we do check we tend to conclude
 a failure other than a memory failure.  We would assume that the directory
 or file does not exist, for example.  How many programmers check errno 
 after such a failure?  Very few.  How many programmers bother to even
 *clear* errno before making these calls (since some system calls do not
   ^^
We're not supposed to have to clear errno unless we have to explicitly
test if it has changed. We're not supposed to clear it before any system
call which could possibly fail and set errno.

 set errno if it already non-zero).  Virtually nobody.
  
Erm... WTF?!?! If so, why the HELL are we doing that?!?

 
   -Matt
   Matthew Dillon 
   dil...@backplane.com
 
 
 

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 gr...@freebsd.org   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Daniel C. Sobral

Technical follow-up:

Contrary to what I previously said, a number of tests reveal that
Solaris, indeed, does not overcommit. All non-read only segments,
and all malloc()ed memory is reserved upon exec() or fork(), and the
reserved memory is not allowed to exceed the total memory. It makes
extensive use of read only DATA segments, and has a NON_RESERVE
mmap() flag.

Though the foot firmly planted in my mouth ought to prevent me from
saying anything else, I must say that it does explain a few things
to me...

--
Daniel C. Sobral(8-DCS)
d...@newsguy.com
d...@freebsd.org

Would you like to go out with me?
I'd love to.
Oh, well, n... err... would you?... ahh... huh... what do I do
next?




Re: Swap overcommit

1999-07-15 Thread Matthew Dillon

: There are dozens of libc routines which call malloc internally and return
: allocated storage.  strdup(), opendir(), fopen(), setvbuf(), asprintf(),
: and so forth.  Dozens.  And while we might check some of these for NULL, 
: we don't check them all, and the ones we do check we tend to conclude
: a failure other than a memory failure.  We would assume that the directory
: or file does not exist, for example.  How many programmers check errno 
: after such a failure?  Very few.  How many programmers bother to even
: *clear* errno before making these calls (since some system calls do not
:  ^^
:We're not supposed to have to clear errno unless we have to explicitly
:test if it has changed. We're not supposed to clear it before any system
:call which could possibly fail and set errno.
:
: set errno if it is already non-zero).  Virtually nobody.
:  
:Erm... WTF?!?! If so, why the HELL are we doing that?!?

No, wait, I got that wrong I think.

Oh yah, I remember now.  Hmm.  How odd.  I came across a case where
read() could return -1 and not set errno properly if errno
was already set, but a perusal of the kernel code seems to indicate
that this can't happen.  Very weird.

-Matt
Matthew Dillon 
dil...@backplane.com



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-15 Thread Matthew Dillon


:Technical follow-up:
:
:Contrary to what I previously said, a number of tests reveal that
:Solaris, indeed, does not overcommit. All non-read only segments,
:and all malloc()ed memory is reserved upon exec() or fork(), and the
:reserved memory is not allowed to exceed the total memory. It makes
:extensive use of read only DATA segments, and has a NON_RESERVE
:mmap() flag.
:
:Though the foot firmly planted in my mouth ought to prevent me from
:saying anything else, I must say that it does explain a few things
:to me...
:
:--
:Daniel C. Sobral   (8-DCS)
:d...@newsguy.com

Something is weird here.  If the solaris people are using a 
SWAPSIZE + REALMEM VM model, they have to allow the 
allocated + reserved space go +REALMEM bytes over available swap 
space.  If not they are using only a SWAPSIZE VM model.

Wait - does Solaris normally use swap files or swap partitions?
Or is it that weird /tmp filesystem stuff?  If it normally uses swap 
files and allows holes then that explains everything.

-Matt
Matthew Dillon 
dil...@backplane.com



Re: Swap overcommit

1999-07-15 Thread Kevin Day

 : There are dozens of libc routines which call malloc internally and return
 : allocated storage.  strdup(), opendir(), fopen(), setvbuf(), asprintf(),
 : and so forth.  Dozens.  And while we might check some of these for NULL,
 : we don't check them all, and the ones we do check we tend to conclude
 : a failure other than a memory failure.  We would assume that the directory
 : or file does not exist, for example.  How many programmers check errno 
 : after such a failure?  Very few.  How many programmers bother to even
 : *clear* errno before making these calls (since some system calls do not
 :^^
 :We're not supposed to have to clear errno unless we have to explicitly
 :test if it has changed. We're not supposed to clear it before any system
 :call which could possibly fail and set errno.
 :
 : set errno if it is already non-zero).  Virtually nobody.
 :  
 :Erm... WTF?!?! If so, why the HELL are we doing that?!?
 
 No, wait, I got that wrong I think.
 
 Oh yah, I remember now.  Hmm.  How odd.  I came across a case where
 read() could return -1 and not set errno properly if errno
 was already set, but a perusal of the kernel code seems to indicate
 that this can't happen.  Very weird.
 

I thought I saw this somewhere too, but I thought it was more of a case that
it was somewhere *inside* read that errno had to be preserved. i.e. errno
gets set somewhere at the top of the code, and if it was already set at a
certain point, failure was expected, and to pass along the original errno,
not the new one.

Or perhaps we're sharing a hallucination. :)

Kevin



Re: Swap overcommit

1999-07-15 Thread David E. Cross

  No, wait, I got that wrong I think.
  
  Oh yah, I remember now.  Hmm.  How odd.  I came across a case where
  read() could return -1 and not set errno properly if errno
  was already set, but a perusal of the kernel code seems to indicate
  that this can't happen.  Very weird.
  
 
 I thought I saw this somewhere too, but I thought it was more of a case that
 it was somewhere *inside* read that errno had to be preserved. i.e. errno
 gets set somewhere at the top of the code, and if it was already set at a
 certain point, failure was expected, and to pass along the original errno,
 not the new one.
 
 Or perhaps we're sharing a hallucination. :)
Well, getpriority(2) certainly can return -1 and not be an error.
You would need to clear out errno before that call and check it on return.
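
A minimal sketch of that pattern in C. The wrapper name read_priority()
is made up for illustration; only getpriority(2), errno and the PRIO_*
constants come from the system headers.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/resource.h>

/*
 * Hypothetical wrapper: returns 0 and stores the nice value in *out,
 * or returns -1 with errno set if getpriority() really failed.
 */
int read_priority(int which, id_t who, int *out)
{
    int prio;

    errno = 0;                      /* -1 is a legal nice value, so clear first */
    prio = getpriority(which, who);
    if (prio == -1 && errno != 0)
        return -1;                  /* genuine failure, errno says why */
    *out = prio;                    /* may legitimately be -1 */
    return 0;
}
```

A caller would use read_priority(PRIO_PROCESS, 0, &nice_value) and look
at the return code instead of comparing the priority itself against -1.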

This is where exceptions would be a great gain.  They could also be used to
force programmers to check their system calls more closely.  Oops, you didn't
handle exception foo?  SIGBADPROGRAMMER.
--
David Cross   | email: cro...@cs.rpi.edu 
Systems Administrator/Research Programmer | Web: http://www.cs.rpi.edu/~crossd 
Rensselaer Polytechnic Institute, | Ph: 518.276.2860
Department of Computer Science| Fax: 518.276.4033
I speak only for myself.  | WinNT:Linux::Linux:FreeBSD



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Brian F. Feldman


On Thu, 15 Jul 1999, Daniel C. Sobral wrote:

 "Charles M. Hannum" wrote:
  
  That's also objectively false.  Most such environments I've had
  experience with are, in fact, multi-user systems.  As you've pointed
  out yourself, there is no combination of resource limits and whatnot
  that are guaranteed to prevent `crashing' a multi-user system due to
  overcommit.  My simulation should not be axed because of a bug in
  someone else's program.  (This is also not hypothetical.  There was a
  bug in one version of bash that caused it to consume all the memory it
  could and then fall over.)
 
 In which case the program that consumed all memory will be killed.
 The program killed is +NOT+ the one demanding memory, it's the one
 with most of it.

So why don't we do something else: when we're down to a certain amount of
backing store, start collecting statistics. When we're out, we check the
statistics and find what process has been allocating most of it. We kill
that process.
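
One way to picture the proposal, as a userland sketch rather than kernel
code. The struct, its fields and pick_victim() are all invented for
illustration; nothing here corresponds to an existing FreeBSD interface.
The assumption is that recent_pages counts pages a process has allocated
since backing store dropped below the low-water mark.

```c
#include <stddef.h>

/* Hypothetical per-process accounting record. */
struct proc_usage {
    int    pid;
    size_t total_pages;    /* everything the process holds */
    size_t recent_pages;   /* allocated since statistics collection began */
};

/*
 * Returns the entry that allocated the most since the low-water mark
 * was crossed, or NULL for an empty table.  Ties go to the first entry.
 */
const struct proc_usage *pick_victim(const struct proc_usage *tab, size_t n)
{
    const struct proc_usage *victim = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (victim == NULL || tab[i].recent_pages > victim->recent_pages)
            victim = &tab[i];
    }
    return victim;
}
```

Under this rule a large but quiet daemon is passed over in favour of a
smaller process that has been allocating quickly, which is the difference
from simply killing the biggest process.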

 
 --
 Daniel C. Sobral  (8-DCS)
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 
   "Would you like to go out with me?"
   "I'd love to."
   "Oh, well, n... err... would you?... ahh... huh... what do I do
 next?"
 

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 [EMAIL PROTECTED]   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Garance A Drosihn


At 12:00 PM -0400 7/14/99, Brian F. Feldman wrote:
 So why don't we do something else: when we're down to a certain
 amount of backing store, start collecting statistics. When we're
 out, we check the statistics and find what process has been
 allocating most of it. We kill that process.

Not that I'm really commenting on the above idea (although it does
sound fine to me), this reminds me about an earlier thread.  Is there
any interest in us (BSD's) having a SIGDANGER signal like some other
OS's do?  That way, key processes (like sshd) could at least make it
less likely that THEY are the process which is killed.
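
For readers who have not seen the idea: AIX is the usual example of a
system with SIGDANGER. A rough sketch of how a daemon might react to such
a signal follows. The BSDs do not define SIGDANGER, so SIGUSR1 stands in
here purely so the code compiles; the handler, the cache pointer and the
function names are all hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in only: SIGDANGER does not exist on the BSDs. */
#ifndef SIGDANGER
#define SIGDANGER SIGUSR1
#endif

static volatile sig_atomic_t danger_seen = 0;
static void *scratch_cache = NULL;   /* hypothetical discardable memory */

static void on_danger(int signo)
{
    (void)signo;
    danger_seen = 1;   /* just set a flag; free() is not async-signal-safe */
}

/* Returns 0 on success, -1 on failure (as sigaction does). */
int install_danger_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_danger;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGDANGER, &sa, NULL);
}

/* Called from the daemon's main loop; returns 1 if memory was released. */
int shed_memory_if_asked(void)
{
    if (!danger_seen)
        return 0;
    danger_seen = 0;
    free(scratch_cache);
    scratch_cache = NULL;
    return 1;
}
```

The point of the design is that the process gives memory back on its own
terms, which makes it a less attractive target if the system still has to
kill something afterwards.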

---
Garance Alistair Drosehn   =   [EMAIL PROTECTED]
Senior Systems Programmer  or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral


"Brian F. Feldman" wrote:
 
  In which case the program that consumed all memory will be killed.
  The program killed is +NOT+ the one demanding memory, it's the one
  with most of it.
 
 So why don't we do something else: when we're down to a certain amount of
 backing store, start collecting statistics. When we're out, we check the
 statistics and find what process has been allocating most of it. We kill
 that process.

Because it's not only equally arbitrary but also takes more
resources to implement?

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Swap overcommit

1999-07-14 Thread Danny Thomas


Ted Faber [EMAIL PROTECTED]
For every strategy there's a counterstrategy.
exactly: the disappointing thing about this whole thread is there's been
little discussion of implementing a (tunable) policy how to handle the
situation when resource shortage materialises.

Overcommitment can be useful, maybe even for most people...

Killing the biggest is simple to implement and usually right.
... but some people don't want that policy, at least on some of their
systems. Does FreeBSD offer alternatives? If so, they've been conspicuously
absent from discussion, which might have taken things into a more
productive vein. What do other over-committing systems offer?

Danny Thomas





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Brian F. Feldman


You don't seem to understand that a runaway process/one designed just
to take up memory will be much more active than your little IMAP servers,
and be the one killed, if this scheme were used.

 Brian Fundakowski Feldman  _ __ ___   ___ ___ ___  
 [EMAIL PROTECTED]   _ __ ___ | _ ) __|   \ 
 FreeBSD: The Power to Serve!_ __ | _ \._ \ |) |
   http://www.FreeBSD.org/  _ |___/___/___/ 




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread lyndon



 You don't seem to understand that a runaway process/one designed just
 to take up memory will be much more active than your little IMAP servers,
 and be the one killed, if this scheme were used.

No, what I don't understand is how the current behaviour can tell that
my temporary and *valid* need for a large chunk of memory does not make
me a runaway process, and therefore subject to death.

--lyndon



Re: Swap overcommit

1999-07-14 Thread lyndon



 What you don't understand is that no process is going to die unless either
 someone is running away (in which case they'll get the bullet) or your
 system is horribly misconfigured (in which case you deserve your fate).

*Why* the machine is out of memory is not the issue. The issue is
what happens when you run out (whether due to stupidity in configuration
or otherwise).

All of the arguments I've seen so far assume that one process is
running off and grabbing all the available memory. That may be
the most likely scenario, but it's most certainly not the *only*
scenario. What if you have a whole bunch of "middle sized" processes
running, all using memory efficiently, but in total using 95% of
the available VM. A malloc(5*1024*1024) might work, but I need
10 MB instead of 5MB. And my memory footprint is just a little
bit bigger than the other guys'. Instead of returning NULL to
the malloc() request, *zap* I'm dead. How can you possibly
call that sensible behaviour?

Yes, the machine is under-resourced. I can't help that -- it's not my
machine. The machine belongs to a customer who happens to run my IMAP
software, who also happens to have ignored our sizing guidelines. In 
this situation I have no choice but to deal with the low memory
condition, and our code does that, if it's given the chance! At
least give me the opportunity to deal with the situation gracefully.

What if we decided to defer errors from bind just because there
weren't any mbufs available, and later killed the process when it
tried to do network I/O? People would howl bloody murder! (== this is
rhetorical, folks)

The semantics of malloc() have been defined since almost the dawn of
time. From the current manpage:

  RETURN VALUES
 The malloc() and calloc() functions return a pointer to the allocated
 memory if successful; otherwise a NULL pointer is returned.

Nowhere does it say that allocated memory might not exist. Nowhere
does it say that I have to touch all the allocated pages to make
sure they are really there. Nowhere does it say process death at
some non-deterministic time in the future might be a side effect 
of calling malloc().
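
To make "touching the pages" concrete, here is a small sketch of what an
application would have to do under overcommit if it wanted any failure to
happen at allocation time. The function name malloc_and_touch() is
invented for this example. Note the limitation: on an overcommitting
system that is truly out of backing store, the write loop is where the
process would be killed, so this moves the failure earlier but does not
turn it into a NULL return.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate nbytes and write one byte in every page
 * so the system has to find backing store for each page now rather than
 * at some later first use.  Returns NULL only if malloc() itself fails.
 */
void *malloc_and_touch(size_t nbytes)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = malloc(nbytes);
    size_t off;

    if (p == NULL)
        return NULL;                 /* the documented failure path */
    if (page <= 0)
        page = 4096;                 /* fallback guess if sysconf fails */
    for (off = 0; off < nbytes; off += (size_t)page)
        p[off] = 0;                  /* one write per page */
    if (nbytes > 0)
        p[nbytes - 1] = 0;           /* the block may straddle a final page */
    return (void *)p;
}
```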

Applications are written assuming that malloc() behaves in the
documented manner. It is *not* acceptable to tell application writers
that they have to provide their own management routines on top of malloc()
(SEGV catchers and the like) if they want the long standing semantics
of malloc() to be preserved. If the current malloc() cannot behave in
the documented and expected manner it needs to be renamed, because
malloc() it most certainly isn't.

--lyndon



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Michael Richardson



 "John" == John Nemeth [EMAIL PROTECTED] writes:
John On one system I administrate, the largest process is typically
John rpc.nisd (the NIS+ server daemon).  Killing that process would be a
John bad thing (TM).  You're talking about killing random processes.
John This is no way to run a system.  It is not possible for any
John arbitrary decision to always hit the correct process.  That is a
John decision that must be made by a competent admin.  This is the
John biggest argument against overcommit: there is no way to gracefully
John recover from an out of memory situation, and that makes for an
John unreliable system.

  No, I don't agree. 

  This is the biggest argument against solving the overcommit situation with
SIGKILL. I have no problem with overcommit as a concept; I have a problem
with being unable to keep my possibly big processes (X, rpc.nisd,
etc. depending on circumstances) from being victims.

] Train travel features AC outlets with no take-off restrictions|  firewalls  [
]   Michael Richardson, Sandelman Software Works, Ottawa, ON|net architect[
] [EMAIL PROTECTED] http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another NetBSD/notebook using, kernel hacking, security guy");  [




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread John Nemeth


On Jul 15,  2:40am, "Daniel C. Sobral" wrote:
} Garance A Drosihn wrote:
}  At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
}   In which case the program that consumed all memory will be killed.
}   The program killed is +NOT+ the one demanding memory, it's the one
}   with most of it.
}  
}  But that isn't always the best process to have killed off...
} 
} Sure it is. :-) Let's see...

 This statement is absurd.  Only a competent admin can decide
which process can be killed.  No arbitrary decision is going to be
correct.

}  It would be nice to have a way to indicate that, a la SIGDANGER.
} 
} Ok, everybody is avoiding this, so I'll comment. Yes, this would be

 The reason I've ignored it, is because SIGDANGER is a hack on top
of a very bad hack.

} interesting, and a good implementation will very probably be
} committed. *BUT*, this is not as useful as it seems. Since the
} correct solution is buy more memory/increase swap (correct solution
} for our target markets, anyway), there is little incentive to
} implement it.

 In case you hadn't noticed, this debate is cross-posted to
NetBSD.  NetBSD's target market isn't the same as FreeBSD's target
market.  This answer is NOT the correct solution for NetBSD's target
market.  Heck, except for one rather vocal person, FreeBSD's target
market may not consider it to be the correct solution either.  I most
certainly do not consider it to be correct, and I admin a lot of
mission critical servers.

}-- End of excerpt from "Daniel C. Sobral"



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Michael Richardson



 "Ben" == Ben Rosengart [EMAIL PROTECTED] writes:
Ben On Wed, 14 Jul 1999, John Nemeth wrote:

 On one system I administrate, the largest process is typically
 rpc.nisd (the NIS+ server daemon).  Killing that process would be a
 bad thing (TM).  You're talking about killing random processes.  This
 is no way to run a system.  It is not possible for any arbitrary
 decision to always hit the correct process.  That is a decision that
 must be made by a competent admin.  This is the biggest argument
 against overcommit: there is no way to gracefully recover from an out
 of memory situation, and that makes for an unreliable system.

Ben $DEITY on a pogo stick, how many times do we have to hear the same
Ben hypothetical argument?

Ben Tell me, Mr. Nemeth, has this ever happened to you?  Have you ever
Ben come *close*?

  Uh, since we don't run overcommit, the answer is specifically *NO*.

  We have never had lack of swap space randomly kill one of our processes.
This is good, and this is the way we want to keep it. 

  I have had it happen on other systems. (Solaris, AIX) It was very
mystifying to diagnose. Sure, the systems were misconfigured for what we
were trying to do, but if I wanted to build a custom system for every
application well... I'd be running NT.

] Train travel features AC outlets with no take-off restrictions|  firewalls  [
]   Michael Richardson, Sandelman Software Works, Ottawa, ON|net architect[
] [EMAIL PROTECTED] http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another NetBSD/notebook using, kernel hacking, security guy");  [



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Jason Thorpe


On Tue, 13 Jul 1999 23:18:58 -0400 (EDT) 
 John Baldwin [EMAIL PROTECTED] wrote:

  What does that have to do with overcommit?  I student-administrate an undergrad
  CS lab at a university, and when students' programs misbehave, they generate a
  fault and are killed.  The only machines that reboot on us without being
  explicitly told to are the NT ones, and yes we run FreeBSD.

What does it have to do with overcommit?  Everything in the world!

If you have a lot of users, all of which have buggy programs which eat
a lot of memory, per-user swap quotas don't necessarily save your butt.

And maybe the individual programs didn't encounter their resource limits.

...but the sheer number of these runaway things caused the overcommit to
be a problem.  If malloc() or whatever had actually returned NULL at the
right time (i.e. as backing store was about to become overcommitted), then
these runaway processes would have stopped running away (they would have
gotten a SIGSEGV and died).

Anyhow, my "lame undergrads" example comes from a time when PCs weren't
really powerful enough for the job (or something; anyhow, we didn't have
any in the department :-).  My example is from a Sequent Balance (16
ns32032 processors, 64M RAM [I think; been a while], 4.2BSD variant).

-- Jason R. Thorpe [EMAIL PROTECTED]




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Matthew Dillon


:On Tue, 13 Jul 1999 23:18:58 -0400 (EDT) 
: John Baldwin [EMAIL PROTECTED] wrote:
:
:  What does that have to do with overcommit?  I student-administrate an undergrad
:  CS lab at a university, and when students' programs misbehave, they generate a
:  fault and are killed.  The only machines that reboot on us without being
:  explicitly told to are the NT ones, and yes we run FreeBSD.
:
:What does it have to do with overcommit?  Everything in the world!
:
:If you have a lot of users, all of which have buggy programs which eat
:a lot of memory, per-user swap quotas don't necessarily save your butt.

If every single one of your users is trying to crash your machine daily,
maybe you should consider throwing them off the system and finding users
that are less hostile.

This conversation is getting silly.  Do you actually believe that
an operating system can magically protect itself 100% from armloads of 
hostile users?

Give me a break.  You people are crazy.  If you have something worthwhile
to say I'll listen, but these "the sky is falling!" arguments are idiotic.

-Matt



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Sergey Babkin


Garance A Drosihn wrote:
 
 At 12:20 AM +0900 7/15/99, Daniel C. Sobral wrote:
  In which case the program that consumed all memory will be killed.
  The program killed is +NOT+ the one demanding memory, it's the one
  with most of it.
 
 But that isn't always the best process to have killed off...
 
 One of my main freebsd machines is mainly here to run one
 process, which is a pretty good-sized process (40meg).  If
 I did get into a memory-shortage problem, I do *not* want
 that process killed, I'd want some other processes killed.
 
 It would be nice to have a way to indicate that, a la SIGDANGER.

Another option may be to add something like "importance classes".
Suppose we assign a one-byte "importance level" to each process.
When we run out of swap we start killing processes with the lowest
importance level. This seems to be both easy to implement and
a rather robust solution.

It can be extended to more flexible policies: say, we divide
this 8-bit number into two 4-bit fields. The high field
will be the "major importance level" and the low field the "importance
sublevel". We permit user processes to change their
sublevel to any value as long as their major level stays the same
or becomes lower. This will allow the users to differentiate
between their programs, but only within certain limits. The initial
importance level may be set in /etc/login.conf.

One more extension would be to use one bit as the inheritance
flag: if it is set, the child inherits the importance value
from the parent. But if it's reset the child inherits its
major level from the parent but the sublevel gets reset to 0.
It may be useful for transparent calls of system().
Yet another extension would be to use two separate inheritance
bits: one as described above, the second one if reset means that
the importance value of the child must be reset to the lowest one.
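
The two-field scheme above can be sketched in a few lines of C. Every
name here (imp_make, imp_change_allowed, imp_inherit) is hypothetical,
and for simplicity the inheritance flag is passed as a separate argument
instead of being carved out of the byte, which is an assumption of this
sketch and not part of the proposal.

```c
#include <stdbool.h>
#include <stdint.h>

/* High nibble: major importance level.  Low nibble: sublevel. */
static inline uint8_t imp_major(uint8_t v) { return (uint8_t)(v >> 4); }
static inline uint8_t imp_sub(uint8_t v)   { return (uint8_t)(v & 0x0F); }

static inline uint8_t imp_make(uint8_t major, uint8_t sub)
{
    return (uint8_t)(((major & 0x0F) << 4) | (sub & 0x0F));
}

/*
 * An unprivileged process may pick any sublevel, provided the major
 * level stays the same or becomes lower.
 */
bool imp_change_allowed(uint8_t current, uint8_t wanted)
{
    return imp_major(wanted) <= imp_major(current);
}

/*
 * Child importance at fork time: a full copy when the inheritance flag
 * is set, otherwise the parent's major level with the sublevel reset to 0.
 */
uint8_t imp_inherit(uint8_t parent, bool inherit_flag)
{
    return inherit_flag ? parent : imp_make(imp_major(parent), 0);
}
```

With a parent at 0x35 (major 3, sublevel 5), a move to 0x3F or 0x20 would
be allowed, a move to 0x40 would not, and a child forked with the flag
cleared would start at 0x30.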

-SB



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Garance A Drosihn


At 3:18 PM -0700 7/14/99, Matthew Dillon wrote:
This conversation is getting silly.  Do you actually believe
that an operating system can magically protect itself 100%
from armloads of hostile users?

Give me a break.  You people are crazy.  If you have something
worthwhile to say I'll listen, but these "the sky is falling!"
arguments are idiotic.

Hmm.  I didn't notice any sky-is-falling arguments in this thread,
so I finally started looking around to see why such nasty replies
keep showing up to what I considered reasonable questions...

So, I finally looked back into the "replacement for grep" thread
(which I have been ignoring ever since it stopped talking about
the grep replacement), and I see this topic is being thrashed to
death over there.  I still think there could be some useful
discussion on what I was *trying* to talk about here, but I
guess it will have to wait until some other time given how
exasperated people are getting.


---
Garance Alistair Drosehn   =   [EMAIL PROTECTED]
Senior Systems Programmer  or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Matthew Dillon


:For the moment I'll pretend that you honestly think that is an
:answer, and I'll note that the very same machine may have well
:over 100 processes each of which takes 1-2 meg of memory.  If
:the machine hits a really-out-of-memory error, I would be much
:much happier to see all 100+ of those processes killed, at once,
:than the one 40-meg process.
:
:Now tell me how I fix my swap under those circumstances.  If
:the answer is "buy infinite memory (ram or disk)", then we don't
:need any overcommit policy in the first place.  Note that the
:problem might be that these 100 processes start taking up 5 or
:10 meg than the 2 meg I'm used to.

Everything scales.  If the load on your machine is such 
that you have hundreds of processes taking 1-2MB of memory,
then let's assume that such a machine has a reasonable
memory configuration of, say, 256MB of ram, and a reasonable
swap configuration of, say, 1GB.  Under normal operating
conditions perhaps 100MB might be swapped out, giving you
900MB of margin.  The actual VM footprint on such a machine
might run on the order of 10 GB (rough guess), of which 350MB
or so has actually been allocated.

With 900MB of margin - which I might add is only about $30 worth 
of disk space, and reasonable process limits, it seems highly
unlikely that the machine will ever run out of swap, even
if a user makes an honest mistake.  I also rather seriously
doubt that a hostile user would have any more or less success
blowing away your process with the non-overcommit model versus
otherwise.

If 1G isn't enough, spend another $30 and throw 2G of swap
online.  Or perhaps dedicate an entire $150 disk and throw
6+ GB of swap online.

The equivalent setup using a non-overcommit model would require
considerably more swap to have the same reliability.  Plus
you have to realize that with either model if you are talking
about saving your work, the same code that does the save-and-exit
in the non-overcommit model can just as easily do a checkpoint
once an hour in the standard overcommit model.  Code that
can't save/checkpoint would not survive either model.

Disk is cheap.  Memory isn't (though it's getting better).
Everything scales.

:I didn't mean to be casting aspersions on the general idea of
:overcommitting, or whatever it is that has your shorts all tied
:up in a knot.
:
:---
:Garance Alistair Drosehn   =   [EMAIL PROTECTED]
:Senior Systems Programmer  or  [EMAIL PROTECTED]

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit

1999-07-14 Thread Mark Newton


[EMAIL PROTECTED] wrote:

  The semantics of malloc() have been defined since almost the dawn of
  time. From the current manpage:
RETURN VALUES
   The malloc() and calloc() functions return a pointer to the allocated
   memory if successful; otherwise a NULL pointer is returned.
  Nowhere does it say that allocated memory might not exist. Nowhere
  does it say that I have to touch all the allocated pages to make
  sure they are really there. Nowhere does it say process death at
  some non-deterministic time in the future might be a side effect 
  of calling malloc().

It's just using a different definition of "successful return of malloc()"
to the one you're trying to use :-)

  - mark


Mark Newton   Email:  [EMAIL PROTECTED] (W)
Network Engineer  Email:  [EMAIL PROTECTED]  (H)
Internode Systems Pty Ltd Desk:   +61-8-82232999
"Network Man" - Anagram of "Mark Newton"  Mobile: +61-416-202-223



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral


John Nemeth wrote:
 
 }  But that isn't always the best process to have killed off...
 }
 } Sure it is. :-) Let's see...
 
  This statement is absurd.  Only a competent admin can decide
 which process can be killed.  No arbitrary decision is going to be
 correct.

We are talking about what process the OS should kill automatically
when it reaches this situation. What criterion should be
used? Is the "biggest process" the "best" process to be killed? Or
is there another, better criterion?

In this context, the statement makes perfect sense, even if you
disagree with it.

 } interesting, and a good implementation will very probably be
 } committed. *BUT*, this is not as useful as it seems. Since the
 } correct solution is buy more memory/increase swap (correct solution
 } for our target markets, anyway), there is little incentive to
 } implement it.
 
  In case you hadn't noticed, this debate is cross-posted to
 NetBSD.  NetBSD's target market isn't the same as FreeBSD's target
 market.  This answer is NOT the correct solution for NetBSD's target
 market.  Heck, except for one rather vocal person, FreeBSD's target
 market may not consider it to be the correct solution either.  I most
 certainly do not consider it to be correct, and I admin a lot of
 mission critical servers.

I noticed, but I do not speak for NetBSD. Well, I do not speak for
FreeBSD either, but I have well informed opinions on it. What I say,
I say about FreeBSD.

As for being "correct", it's really simple. Either you have enough
memory, or you do not. If you don't have enough memory, a number of
programs cannot function correctly. Sure, some programs might be
able to deal with low-memory situations, but *other* programs
*cannot* deal with it. It's impossible for them to accomplish their
tasks if there is not enough memory. So, if you want that server to
accomplish its job, you need more memory.

Which, btw, is cheaper than the man-hours needed to implement
SIGDANGER.

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral


Sergey Babkin wrote:
 
  It would be nice to have a way to indicate that, a la SIGDANGER.
 
 Another option may be to add something like "importance classes".
 Suppose we assign a one-byte "importance level" to each process.
 When we run out of swap we start killing processes with the lowest
 importance level. This seems to be both easy to implement and
 a rather robust solution.

This is as easy to do as setting limits, which has the added benefit
of not having any process killed.
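
For comparison, the "importance classes" idea quoted above could be sketched roughly as follows. This is only an illustration, not anything that exists in FreeBSD or NetBSD: the Proc record, the importance field and the tie-break on size are all hypothetical names and choices made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Proc:
    pid: int
    name: str
    importance: int  # hypothetical one-byte level (0..255); lowest is killed first
    size_kb: int     # memory footprint, used here only to break ties


def pick_victim(procs: List[Proc]) -> Optional[Proc]:
    """Return the process to kill when swap runs out, or None if there is none.

    Policy (an assumption for this sketch): lowest importance level first;
    among equally unimportant processes, the largest one goes.
    """
    if not procs:
        return None
    return min(procs, key=lambda p: (p.importance, -p.size_kb))


if __name__ == "__main__":
    table = [
        Proc(1, "rpc.nisd", 200, 50000),
        Proc(2, "student_a.out", 10, 90000),
        Proc(3, "student_b.out", 10, 120000),
    ]
    print(pick_victim(table))
```

Note that someone still has to assign the levels per process, which is about the same administrative effort as setting resource limits.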

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Matthew Dillon


:Our IMAP server routinely shows a footprint of about 1MB private storage.
:This is constant for most operations. However, when you get into doing
:SEARCH and SORT, there are certain cases where we need memory, sometimes
:a *lot* of memory.
:
:Your proposal is that my *well behaved* application should be arbitrarily
:killed, leaving the client stuck with a) no results and b) no IMAP
:connection, in this situation. (And think threaded. That one server
:could be handling *hundreds* of clients.) This is preferable to
:returning a NULL to the malloc() request, which I can handle
:gracefully by simply returning a NO response to the IMAP client?
:
:What is so evil about having a reasonably intelligent malloc() that
:tells the truth, and returns unused memory to the system? Overcommit
:is for lazy programmers, plain and simple. At least the SGI documentation
:about overcommit admits that (or at least, did at one time).
:
:--lyndon

If you are running an IMAP server that regularly runs out of swap
space, you have a configuration problem which needs to be addressed.
It's as simple as that.  What you are putting forth is an example
of something that will never happen on a properly configured 
server.

As for the general case, where one is running third-party
applications: here you are assuming that you can go in and modify
every single piece of software running on the machine to deal
with malloc() returning NULL.  Because if you don't, the machine
isn't going to be very stable.

Not only that, you are assuming that you will make the correct
decision on what action to take when malloc() *does* return NULL.
If you decide to return an error code but not exit, what happens
when a potential blowup situation results in thousands of imap
processes being run on the system, and NONE of them exit when
their malloc() fails?

The problem is a whole lot more complex than simply having the
OS return NULL from a malloc().  Currently the OS kills processes
as a last resort.  The idea is that no nominally running system
runs out of swap.  Now you propose to take away the kernel's
ability to recover some memory as a last resort and instead
put it into the hands of the very user or root-run processes
that are causing the problem in the first place!  A much better
solution would be to write a simple watchdog script that notices
when swap space is low and does the right thing -- e.g. kills
the non-essential processes and leaves the essential ones alone.
Then the kernel never actually reaches a state of last-resort.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]




Re: Swap overcommit

1999-07-14 Thread Matthew Dillon


:Ted Faber [EMAIL PROTECTED]
:For every strategy there's a counterstrategy.
:exactly: the disappointing thing about this whole thread is there's been
:little discussion of implementing a (tunable) policy how to handle the
:situation when resource shortage materialises.
:
:Overcommitment can be useful, maybe even for most people...
:
:Killing the biggest is simple to implement and usually right.
:... but some people don't want that policy, at least on some of their
:systems. Does FreeBSD offer alternatives? If so, they've been conspicuously
:absent from discussion, which might have taken things into a more
:productive vein. What do other over-committing systems offer?
:
:Danny Thomas

Here's an alternative:

    while (1)
        sleep 60
        blah blah blah run pstat -s, get available swap.
        if available swap < 200MB then
            blow away some non-critical processes
            if no non-critical processes remain
                blow away everything not owned by root and
                yell for help.
                if no non-root processes remain
                    do nothing. let the kernel blow away
                    the biggest process when swap actually
                    runs out.
                endif
            endif
        endif
    end

How long do you suppose it would take to actually write that
script?  One hour?  Two hours?  Not long, I think.

Problem solved.
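
The decision logic in the pseudocode above can be sketched in Python as one pass of the watchdog. This is a minimal illustration under stated assumptions: the Proc record and its "critical" flag are hypothetical (an admin would have to maintain that list somehow), "some non-critical processes" is simplified to "all of them", and the code only decides what to kill. Reading the available swap (e.g. from pstat -s), sleeping 60 seconds between passes and actually sending signals are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple

LOW_SWAP_KB = 200 * 1024  # the 200MB threshold from the pseudocode


@dataclass
class Proc:
    pid: int
    uid: int
    critical: bool  # hypothetical flag; not something the kernel tracks


def choose_kills(avail_swap_kb: int, procs: List[Proc]) -> Tuple[List[int], bool]:
    """One watchdog pass: return (pids to kill, whether to yell for help)."""
    if avail_swap_kb >= LOW_SWAP_KB:
        return [], False  # plenty of swap left, nothing to do

    # First resort: non-critical processes (simplified to all of them).
    non_critical = [p.pid for p in procs if not p.critical]
    if non_critical:
        return non_critical, False

    # Second resort: everything not owned by root, and alert the admin.
    non_root = [p.pid for p in procs if p.uid != 0]
    if non_root:
        return non_root, True

    # Only critical root processes remain: do nothing and let the kernel
    # kill the biggest process if swap actually runs out.
    return [], False


if __name__ == "__main__":
    procs = [Proc(10, 1001, False), Proc(11, 0, True)]
    print(choose_kills(100 * 1024, procs))
```

A real script would call something like this once a minute and pass the returned pids to kill.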

-Matt
Matthew Dillon 
[EMAIL PROTECTED]



Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral


John Nemeth wrote:
 
  On one system I administrate, the largest process is typically
 rpc.nisd (the NIS+ server daemon).  Killing that process would be a
 bad thing (TM).  You're talking about killing random processes.  This
 is no way to run a system.  It is not possible for any arbitrary
 decision to always hit the correct process.  That is a decision that
 must be made by a competent admin.  This is the biggest argument
 against overcommit:  there is no way to gracefully recover from an
 out of memory situation, and that makes for an unreliable system.

If you run out of memory, it is either a misconfigured system, or a
runaway program. If a program is runaway, then:

1) It is larger than your typical rpc.nisd.
2) You cannot tell the system a priori to kill it, because you don't
know about it (or else, you wouldn't be running it in the first place).

A system running in overcommit assumes that you have it correctly
configured so it will *not* run out of memory under normal
conditions. This happens to be the same assumption Unix makes.
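
To make point 1 concrete, here is a toy sketch of the "kill the biggest" policy. The process names and sizes are invented for illustration, and this is not the kernel's actual code; it only shows why a runaway program, being larger than a typical rpc.nisd, is the one that gets picked.

```python
from typing import Dict, Optional


def biggest(procs: Dict[str, int]) -> Optional[str]:
    """Given a mapping of process name to size in KB, return the largest one."""
    if not procs:
        return None
    return max(procs, key=procs.get)


if __name__ == "__main__":
    # Hypothetical sizes: the runaway has grown well past the NIS+ daemon.
    sizes_kb = {"rpc.nisd": 40000, "runaway": 900000, "sh": 500}
    print(biggest(sizes_kb))
```

The policy only goes wrong when the legitimate daemon is bigger than whatever is eating the memory, which is the misconfigured-system case.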

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral


Michael Richardson wrote:
 
 Ben Tell me, Mr. Nemeth, has this ever happened to you?  Have you ever
 Ben come *close*?
 
   Uh, since we don't run overcommit, the answer is specifically *NO*.

And what system do you run?

   I have had it happen on other systems. (Solaris, AIX) It was very
 mystifying to diagnose. Sure, the systems were misconfigured for what we
 were trying to do, but if I wanted to build a custom system for every
 application well... I'd be running NT.

I have to agree about the mystifying diagnosis... Especially when they
*don't* page like hell.

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral


[EMAIL PROTECTED] wrote:
 
 What is so evil about having a reasonably intelligent malloc() that
 tells the truth, and returns unused memory to the system? Overcommit
 is for lazy programmers, plain and simple. At least the SGI documentation
 about overcommit admits that (or at least, did at one time).

Yes. So are high-level languages, as a matter of fact. True
memory-conscious programmers will never use anything besides
assembler.

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"





Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Daniel C. Sobral


Jason Thorpe wrote:
 
 If you have a lot of users, all of which have buggy programs which eat
 a lot of memory, per-user swap quotas don't necessarily save your butt.

The chance of these buggy programs running at the same time is not
exactly high...

 And maybe the individual programs didn't encounter their resource limits.
 
 ...but the sheer number of these runaway things caused the overcommit to
 be a problem.  If malloc() or whatever had actually returned NULL at the
 right time (i.e. as backing store was about to become overcommitted), then
 these runaway processes would have stopped running away (they would have
 gotten a SIGSEGV and died).
 
 Anyhow, my "lame undergrads" example comes from a time when PCs weren't
 really powerful enough for the job (or something; anyhow, we didn't have
 any in the department :-).  My example is from a Sequent Balance (16
 ns32032 processors, 64M RAM [I think; been a while], 4.2BSD variant).

So, tell me... when NetBSD gets its non-overcommit switch, would
you use it in the environment you describe?

--
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]

"Would you like to go out with me?"
"I'd love to."
"Oh, well, n... err... would you?... ahh... huh... what do I do
next?"




Re: Swap overcommit (was Re: Replacement for grep(1) (part 2))

1999-07-14 Thread Tim Vanderhoek


On Thu, Jul 15, 1999 at 01:48:40PM +0900, Daniel C. Sobral wrote:
  
  If you have a lot of users, all of which have buggy programs which eat
  a lot of memory, per-user swap quotas don't necessarily save your butt.
 
 The chance of these buggy programs running at the same time is not
 exactly high...

Well, it is higher than you're probably giving credit for.  Suppose
Professor A. hands out assignment X.  Unfortunately, some piece of
code he supplied to his, let's say, 200 ignorant first-year
students has this particular memory-eating bug.  Being ignorant
first-year students, they will notice something is wrong, assume
the problem is their fault, and repeat the exact same procedure
five or so times.  Again, being ignorant first year students, they
will probably all be using the same shell server.

To make things worse, some wise-ass may have told a bunch of them how
to use ulimit or limit in order to push their available resources as
high as possible (perhaps very high, since the admin hopefully
recognizes that sometimes students need high resource limits to
perform research).

Fortunately, overcommit rescues the machine and kills those buggy
programs instead of letting them spin around forever in some kind of
"malloc() failed ... must be temporary failure, wait and retry".


-- 
This is my .signature which gets appended to the end of my messages.


