Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-24 Thread Vallo Kallaste

On Tue, Apr 23, 2002 at 09:40:11PM -0700, David Schultz
[EMAIL PROTECTED] wrote:

  Userspace processes will allocate memory from UVA space and can
  grow over 1GB in size if needed by swapping.  You can certainly
  have more than one over-1GB process going on at the same time,
  but swapping will constrain your performance.
 
 It isn't a performance constraint.  32-bit architectures have
 32-bit pointers, so in the absence of segmentation tricks, a
 virtual address space can only contain 2^32 = 4G locations.  If
 the kernel gets 3 GB of that, the maximum amount of memory that
 any individual user process can use is 1 GB.  If you had, say, 4
 GB of physical memory, a single user process could not use it all.
 Swap increases the total amount of memory that *all* processes can
 allocate by pushing some of the pages out of RAM and onto the
 disk, but it doesn't increase the total amount of memory that a
 single process can address.

Thank you, Terry and David, now I grasp how it should work (I hope).
I really miss some education, but that's life.
-- 

Vallo Kallaste
[EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-stable" in the body of the message



Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-24 Thread Terry Lambert

David Schultz wrote:
 Thus spake Terry Lambert [EMAIL PROTECTED]:
  Writing a useful (non-fluff) technical book, optimistically,
  takes 2080 hours ... or 40 hours per week for 52 weeks... a man
  year.
 
  By the time you are done, the book is a year out of date, and
  even if you worked really hard and kept it up to date (e.g. you
  had 4 authors and spent only 6 months of wall time on the book),
  the shelf life on the book is still pretty short.
 
 Although it would be unreasonable to comprehensively document the
 kernel internals and expect the details to remain valid for a year,
 there is a great deal of lasting information that could be conveyed.
 For example, Kirk's 4.[34]BSD books cover obsolete systems, and yet
 much of what they say applies equally well to recent versions of
 FreeBSD.

These are general OS architecture books by a noted authority on
OS architecture.  That's a barrier to entry for other authors,
as the intrinsic value in the information is not constrained to
the direct subject of the work.  8-).

Kirk is supposedly working on a similar book for FreeBSD, release
date indeterminate.

In any case, this doesn't resolve the issue of "Where do I go to
do XXX to version YYY, without having to learn everything there is
to know about YYY?".


 It's true that the specific question ``How do I change my KVA size?''
 might have different answers at different times, but I doubt that the
 ideas behind an answer have all been invented in the last few months.
 Even things like PAE, used by the Linux 2.4 kernel, remind me of how
 DOS dealt with the 1 MB memory limit.

PAE is the thing that Peter was reportedly working on in order
to break the 4G barrier on machines capable of accessing up to 16G
of RAM using bank selection.  I didn't mention it by name, since
the general principle is also applicable to the Alpha, which has a
current limit of 2G because of the DMA barrier and other constraints.


While it's true that the ideas behind the answer remain the same,
those ideas are already published in the books I've referenced
earlier in this thread.

If people were content to discover implementation details based on
a working knowledge of general principles, then this thread would
never have occurred in the first place.


It's my opinion that people want to do more in-depth things
to the operating system, and that there is a latency barrier in
the way of their doing so.  My participation in this discussion,
particularly with regard to the publication of thorough and useful
documentation, has really centered on this point.


-- Terry




Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-23 Thread Vallo Kallaste

On Tue, Apr 23, 2002 at 09:44:50AM -0300, Marc G. Fournier
[EMAIL PROTECTED] wrote:

 Next, again, if I'm reading this right ... if I set my KVA to 3G,
 when the system boots, it will reserve 3G of *physical* RAM for
 the kernel itself, correct?  So on a 4G machine, 1G of *physical*
 RAM will be available for UVAs ... so, if I run 1G worth of
 processes, that is where swapping to disk comes in, right?  Other
 than the massive performance hit, and the limit you mention about
 some parts of UVA not being swappable, I could theoretically have
 4G of swap to page out to?

You can have up to ~12GB of usable swap space, as I've heard. I don't
remember why there's such an arbitrary limit, unfortunately. Information
about such topics is spread over several list archives, and the subject
lines are usually strange too, so it's hard to find. As I understand it
you are on the right track: having 3GB allocated to KVA means 1GB for UVA,
whatever that exactly means. Userspace processes will allocate memory
from UVA space and can grow over 1GB in size if needed by swapping.
You can certainly have more than one over-1GB process going on at
the same time, but swapping will constrain your performance.
I'm sure Terry or some other knowledgeable person will correct me if
it doesn't make sense.

 Is there a reason why this stuff isn't auto-scaled based on RAM as
 it is?

Probably lack of manpower; to code it up you'd have to understand
every bit of it, and as we currently see, we don't understand it,
and probably many others don't either :-)
-- 

Vallo Kallaste
[EMAIL PROTECTED]




Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-23 Thread Terry Lambert

Vallo Kallaste wrote:
 You can have up to ~12GB of usable swap space, as I've heard. I don't
 remember why there's such an arbitrary limit, unfortunately. Information
 about such topics is spread over several list archives, and the subject
 lines are usually strange too, so it's hard to find. As I understand it
 you are on the right track: having 3GB allocated to KVA means 1GB for UVA,
 whatever that exactly means. Userspace processes will allocate memory
 from UVA space and can grow over 1GB in size if needed by swapping.
 You can certainly have more than one over-1GB process going on at
 the same time, but swapping will constrain your performance.
 I'm sure Terry or some other knowledgeable person will correct me if
 it doesn't make sense.

Actually, you have a total concurrent virtual address space of 4G.

If you assign 3G of that to KVA, then you can never exceed 1G of
space for a user process, under any circumstances.

This is because a given user process and kernel must be able
to exist simultaneously in order to do things like copyin/copyout.
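The split described above is just 32-bit arithmetic; a quick sketch (the
3G KVA figure is the example from this thread, not a recommendation):

```shell
# One 32-bit virtual address space, shared between the kernel (KVA) and
# any single user process (UVA).  Swap changes none of these numbers.
TOTAL_VA=$(( 1 << 32 ))               # 4G of virtual addresses
KVA=$(( 3 * 1024 * 1024 * 1024 ))     # example kernel share from this thread
UVA=$(( TOTAL_VA - KVA ))             # ceiling for any one user process
echo "UVA = $(( UVA / 1024 / 1024 )) MB"   # prints: UVA = 1024 MB
```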


  Is there a reason why this stuff isn't auto-scaled based on RAM as
  it is?
 
 Probably lack of manpower; to code it up you'd have to understand
 every bit of it, and as we currently see, we don't understand it,
 and probably many others don't either :-)

A lot of things are autosized.  Matt Dillon made some recent changes
in this regard.  But many things happen based on expected usage.

You can't auto-size the KVA because the kernel is relocated at
the base of the KVA space.  As far as it's concerned, it's
loaded at 1M, and as far as processes are concerned, they're
loaded in low memory.

The main barrier to autosizing things so that expected usage is
not an issue is that you have to preallocate page mappings, if
not physical pages to back them, at boot time, for anything
that can be allocated at interrupt time (e.g. mbufs).

The other barrier here is that some things are grouped together
that probably ought to be separate: e.g. maxfiles controls inpcb,
tcpcb, and udpcb allocation, which occurs at boot time, as well
as other limits that occur at runtime.  So using the sysctl at
runtime doesn't adjust everything you think it does, but doing it
in /boot/loader.conf does.  For the same reason, setting things
like hash table sizes and then adjusting them larger doesn't
really work out very well, either.
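As a rough illustration of that boot-time vs. runtime distinction (the
tunable name is a standard FreeBSD one, but the value here is a
placeholder, not a recommendation):

```shell
# /boot/loader.conf -- read before the kernel initializes, so the
# boot-time allocations (pcb tables, hashes, etc.) are sized from it:
#   kern.maxfiles="65536"

# The same name set at runtime adjusts only the runtime limit; tables
# already sized at boot keep their original size:
sysctl kern.maxfiles=65536
```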

It really does boil down to understanding every bit of it... and
the lack of resources to help you do that.

Bryan Costales is not just a lazy butt... there are good,
economic reasons that there isn't yet an updated sendmail book
covering more recent versions of the sendmail program.

Writing a useful (non-fluff) technical book, optimistically,
takes 2080 hours ... or 40 hours per week for 52 weeks... a man
year.

By the time you are done, the book is a year out of date, and
even if you worked really hard and kept it up to date (e.g. you
had 4 authors and spent only 6 months of wall time on the book),
the shelf life on the book is still pretty short.

The recent "How do I change my KVA size?" question is a good
example: a book that came out six months ago would have needed
four revisions, based on the changes to the config program
and the code involved, and would be out of date for 4.5-STABLE,
5.0-RELEASE, and 4.6-RELEASE (when it comes out).

At that point, the online version's addenda/errata is so much
more useful than the book itself that there's really no good
justification for buying the book instead of just looking at the
online information: it's a totally different set of information.

If FreeBSD wants someone to write an in depth book, it's got
to have a commitment to not change some basic principles and
code for long enough for the book to be useful with something
more than just a really old CDROM included in the book itself.

-- Terry




Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-23 Thread Vallo Kallaste

On Tue, Apr 23, 2002 at 12:25:31PM -0700, Terry Lambert
[EMAIL PROTECTED] wrote:

 Vallo Kallaste wrote:
  You can have up to ~12GB of usable swap space, as I've heard. I don't
  remember why there's such an arbitrary limit, unfortunately. Information
  about such topics is spread over several list archives, and the subject
  lines are usually strange too, so it's hard to find. As I understand it
  you are on the right track: having 3GB allocated to KVA means 1GB for UVA,
  whatever that exactly means. Userspace processes will allocate memory
  from UVA space and can grow over 1GB in size if needed by swapping.
  You can certainly have more than one over-1GB process going on at
  the same time, but swapping will constrain your performance.
  I'm sure Terry or some other knowledgeable person will correct me if
  it doesn't make sense.
 
 Actually, you have a total concurrent virtual address space of 4G.
 
 If you assign 3G of that to KVA, then you can never exceed 1G of
 space for a user process, under any circumstances.
 
 This is because a given user process and kernel must be able
 to exist simultaneously in order to do things like copyin/copyout.

Hmm, OK, but can we have more than one 1G user process at a time?
Four 500MB ones, and so on?
Somehow I'd drawn that conclusion from the previous information.
It should be so; otherwise I don't understand how swapping fits
into the overall picture.
-- 

Vallo Kallaste
[EMAIL PROTECTED]




Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-23 Thread Dave Hayes

Terry Lambert (who fits my arbitrary definition of a good cynic)
writes:
 It's a hazard of Open Source projects, in general, that there are
 so many people hacking on whatever they think is cool that nothing
 ever really gets built to a long term design plan that's stable
 enough that a book stands a chance of having a 1 year lifetime.

I could not help but notice your multiple attempts at expressing
this particular concept, namely the implied necessity of a book
that explains what's going on under the kernel hood.  I agree that
such a book would rapidly go out of date, but I also see the
necessity thereof.

So, it's time to question the assumption that the information you want
available should be in a book.

Many websites have annotation as a form of ad-hoc documentation
(e.g. php.net). Why not have someone take a crack at documenting the
FreeBSD kernel, and perhaps use some annotation feature to create a
living document which (hopefully) comes close to describing the
kernel architecture?

If you want to track a moving target, perhaps you need to use a moving
track? 
--
Dave Hayes - Consultant - Altadena CA, USA - [EMAIL PROTECTED] 
 The opinions expressed above are entirely my own 

What's so special about the Net? People -still- don't
listen...
  -The Unknown Drummer




Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-23 Thread David Schultz

Thus spake Terry Lambert [EMAIL PROTECTED]:
 Writing a useful (non-fluff) technical book, optimistically,
 takes 2080 hours ... or 40 hours per week for 52 weeks... a man
 year.
 
 By the time you are done, the book is a year out of date, and
 even if you worked really hard and kept it up to date (e.g. you
 had 4 authors and spent only 6 months of wall time on the book),
 the shelf life on the book is still pretty short.

Although it would be unreasonable to comprehensively document the
kernel internals and expect the details to remain valid for a year,
there is a great deal of lasting information that could be conveyed.
For example, Kirk's 4.[34]BSD books cover obsolete systems, and yet
much of what they say applies equally well to recent versions of
FreeBSD.

It's true that the specific question ``How do I change my KVA size?''
might have different answers at different times, but I doubt that the
ideas behind an answer have all been invented in the last few months.
Even things like PAE, used by the Linux 2.4 kernel, remind me of how
DOS dealt with the 1 MB memory limit.
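For reference, PAE widens *physical* addresses from 32 to 36 bits while
each process's *virtual* address space stays 32-bit, which is why it is
reminiscent of the DOS-era bank-switching tricks; the arithmetic:

```shell
# Physical memory reachable with 32- vs. 36-bit addresses:
echo "32-bit: $(( (1 << 32) / (1024 * 1024 * 1024) )) GB"   # prints: 32-bit: 4 GB
echo "36-bit: $(( (1 << 36) / (1024 * 1024 * 1024) )) GB"   # prints: 36-bit: 64 GB
```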




Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-22 Thread Vizion Communication

Test - Please ignore
- Original Message - 
From: David Schultz [EMAIL PROTECTED]
To: Terry Lambert [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Monday, April 22, 2002 6:09 AM
Subject: Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?


 Thus spake Terry Lambert [EMAIL PROTECTED]:
  If you want more, then you need to use a 64 bit processor (or use a
  processor that supports bank selection, and hack up FreeBSD to do
  bank swapping on 2G at a time, just like Linux has been hacked up,
  and expect that it won't be very useful).
 
 I'm guessing that this just means looking at more than 4 GB of memory
 by working with 2 GB frames at a time.  As I recall, David Greenman
 said that this hack would essentially require a rewrite of the VM
 system.  Does this just boil down to using 36 bit physical addresses?
 Are there plans for FreeBSD to support it, or is everyone just waiting
 until 64 bit processors become more common?
 
  You can't
  really avoid that, for the most part, since there's a shared TLB
  cache that you really don't have opportunity to manage, other than
  by separating 4M vs. 4K pages (and 2M, etc., for the Pentium Pro,
  though variable page granularity is not supported in FreeBSD, since
  it's not common to most hardware people actually have).
 
 Does FreeBSD use 4M pages exclusively for kernel memory, as in
 Solaris, or is there a more complicated scheme?
 
  If you increase the KVA, then you will decrease the UVA available to
  user processes.  The total of the two can not exceed 4G.
 
 In Linux, all of physical memory is mapped into the kernel's virtual
 address space, and hence, until recently Linux was limited to ~3 GB of
 physical memory.  FreeBSD, as I understand, doesn't do that.  So is
 the cause of this limitation that the top half of the kernel has to
 share a virtual address space with user processes?
 
 I'll have to read those books one of these days when I have time(6).
 Thanks for the info.
 





FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-20 Thread Marc G. Fournier


Over the past week, I've been trying to get information on how to fix a
server that panics with:

| panic: vm_map_entry_create: kernel resources exhausted
| mp_lock = 0101; cpuid = 1; lapic.id = 0100
| boot() called on cpu#1

Great ... but, how do I determine what 'resources' I need to increase to
avoid that crash?  I've tried increasing maxusers from 512 to 1024, but *if*
that works, I imagine I'm raising a bunch of limits (and using memory)
that I don't have to ...

The server is a Dual-CPU PIII-1Ghz with 3Gig of RAM and ~3Gig of swap
space right now ... the data drive is 5x18gig drives in a RAID5
configuration (hardware RAID, not vinum) ...

I ran top in an xterm so that I could see what was up just before the
crash, and the results were:

last pid: 84988;  load averages: 19.82, 57.35, 44.426   up 0+23:33:12 02:05:00
5021 processes:16 running, 5005 sleeping
CPU states:  8.7% user,  0.0% nice, 24.3% system,  2.2% interrupt, 64.7% idle
Mem: 2320M Active, 211M Inact, 390M Wired, 92M Cache, 199M Buf, 4348K Free
Swap: 3072M Total, 1048M Used, 2024M Free, 34% Inuse, 448K Out

So, I have plenty of swapspace left, lots of idle CPU and a whole
whack of processes ...

Now, looking at the LINT file, there appear to be *a lot* of
things I *could* change ... for instance, NSFBUFS, KVA_FILES, etc ... but
I don't imagine that changing these blindly is particularly wise ... so,
how do you determine what to change?  For instance, at a maxusers of 512,
NSFBUFS should be ~8704, and if I've only got 5000 processes running,
chances are I'm still safe at that value, no?  But sysctl doesn't show any
'sf_buf' value, so how do I figure out what I'm using?

Basically, are there any commands similar to netstat -m for
nmbclusters that I can run to 'monitor' and isolate where I'm exhausting
these resources?
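For what it's worth, the kind of monitoring being asked about looks
roughly like this on 4.x; command availability varies by version, so
treat this as a sketch rather than a confirmed answer:

```shell
netstat -m        # mbuf/cluster usage against the NMBCLUSTERS limit
vmstat -m         # per-type kernel malloc() statistics
vmstat -z         # kernel zone allocator usage (vm map entries are zone-allocated)
pstat -s          # swap space usage
```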

Is there a doc on this sort of stuff that I should be reading for
this?  Something that talks about kernel tuning for high-load/processes
servers?

Thanks for any help in advance ..

---
machine i386
cpu I686_CPU
ident   kernel
maxusers1024

options NMBCLUSTERS=15360

options INET#InterNETworking
options INET6   #IPv6 communications protocols
options FFS #Berkeley Fast Filesystem
options FFS_ROOT#FFS usable as root device [keep this!]
options SOFTUPDATES #Enable FFS soft updates support
options PROCFS  #Process filesystem
options COMPAT_43   #Compatible with BSD 4.3 [KEEP THIS!]
options SCSI_DELAY=15000#Delay (in ms) before probing SCSI
options KTRACE  #ktrace(1) support

options SYSVSHM
options SHMMAXPGS=98304
options SHMMAX=(SHMMAXPGS*PAGE_SIZE+1)

options SYSVSEM
options SEMMNI=2048
options SEMMNS=4096

options SYSVMSG #SYSV-style message queues

options P1003_1B#Posix P1003_1B real-time extensions
options _KPOSIX_PRIORITY_SCHEDULING
options ICMP_BANDLIM#Rate limit bad replies

options SMP # Symmetric MultiProcessor Kernel
options APIC_IO # Symmetric (APIC) I/O

device  isa
device  pci

device  scbus   # SCSI bus (required)
device  da  # Direct Access (disks)
device  sa  # Sequential Access (tape etc)
device  cd  # CD
device  pass# Passthrough device (direct SCSI access)

device  amr # AMI MegaRAID
device  sym

device  atkbdc0 at isa? port IO_KBD
device  atkbd0  at atkbdc? irq 1 flags 0x1
device  psm0at atkbdc? irq 12

device  vga0at isa?

pseudo-device   splash

device  sc0 at isa? flags 0x100

device  npx0at nexus? port IO_NPX irq 13

device  sio0at isa? port IO_COM1 flags 0x10 irq 4
device  sio1at isa? port IO_COM2 irq 3

device  miibus  # MII bus support
device  fxp # Intel EtherExpress PRO/100B (82557, 82558)

pseudo-device   loop# Network loopback
pseudo-device   ether   # Ethernet support
pseudo-device   pty 256 # Pseudo-ttys (telnet etc)
pseudo-device   gif # IPv6 and IPv4 tunneling
pseudo-device   faith   1   # IPv6-to-IPv4 relaying (translation)

pseudo-device   bpf #Berkeley packet filter




Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-20 Thread Mike Grissom

If you are using 4.5 then you should probably set maxusers to 0 and remove
the NMBCLUSTERS option; that enables auto-scaling, which bases the settings
on how much RAM you have.  You can then raise NMBCLUSTERS and other
parameters individually when they need to be higher.
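In kernel-config terms, that suggestion amounts to something like the
following sketch (the loader.conf value is a placeholder, not a
recommendation):

```shell
# In the kernel config file, let the kernel size its tables from RAM:
#   maxusers        0
# and drop the explicit "options NMBCLUSTERS=15360" line entirely.

# Individual limits can still be raised later where needed, e.g. in
# /boot/loader.conf:
#   kern.ipc.nmbclusters="32768"
```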

- Original Message -
From: Marc G. Fournier [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Saturday, April 20, 2002 3:14 PM
Subject: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?



 [quoted original message and kernel config snipped; see above]






Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-20 Thread Marc G. Fournier



As a quick follow-up to this, doing more searching on the web, I came
across a few suggested 'sysctl' settings, which I've added to what I had
before, for a total of:

kern.maxfiles=65534
jail.sysvipc_allowed=1
vm.swap_idle_enabled=1
vfs.vmiodirenable=1
kern.ipc.somaxconn=4096

I've also just reduced my maxusers to 256 from 1024, since 1024 was
crashing worse than 512, and I ran across the 'tuning' man page, which
states that you shouldn't go above 256 :(

Just a bit more detail on the setup ...

On Sat, 20 Apr 2002, Marc G. Fournier wrote:


 [quoted original message and kernel config snipped; see above]

Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-20 Thread Alfred Perlstein

* The Hermit Hacker [EMAIL PROTECTED] [020420 16:01] wrote:
 
 
 As a quick follow-up to this, doing more searching on the web, I came
 across a few suggested 'sysctl' settings, which I've added to what I had
 before, for a total of:
 
 kern.maxfiles=65534
 jail.sysvipc_allowed=1
 vm.swap_idle_enabled=1
 vfs.vmiodirenable=1
 kern.ipc.somaxconn=4096
 
 I've also just reduced my maxusers to 256 from 1024, since 1024 was
 crashing worse than 512, and I ran across the 'tuning' man page, which
 states that you shouldn't go above 256 :(
 
 Just a bit more detail on the setup ...

You said you're running 5000 processes.  5000 processes of what?

Are they using SYSVSHM?  If so, this sysctl might help:

kern.ipc.shm_use_phys=1

It'll only work if you set it before your processes start up.
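A sketch of how that is typically applied; because the setting only
affects segments created after it is set, /etc/sysctl.conf (applied at
boot, before the daemons start) is the natural place for it:

```shell
# /etc/sysctl.conf:
#   kern.ipc.shm_use_phys=1

# Or set it live, then restart whatever uses SysV shared memory:
sysctl kern.ipc.shm_use_phys=1
```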

Some more information about what these 5000 processes are doing
would help.

-Alfred




Re: FreeBSD 4.5-STABLE not easily scalable to large servers ... ?

2002-04-20 Thread Marc G. Fournier

On Sat, 20 Apr 2002, Alfred Perlstein wrote:

 * The Hermit Hacker [EMAIL PROTECTED] [020420 16:01] wrote:
 
 
  [quoted sysctl settings and maxusers note snipped]

 You said you're running 5000 processes.  5000 processes of what?

 Are they useing SYSVSHM?  If so, this sysctl might help:

 kern.ipc.shm_use_phys=1

Okay, never knew of that one before ... have it set for the next reboot,
as I do have a few postgresql servers going on the 'root (non-jail)'
server ...

 It'll only work if you set it before your processes setup.

 Some more information about what these 5000 processes are doing
 would help.

Sorry ... the server is running ~210 jails ... so the '5k processes' would
be when they all start up their periodic scripts ... normally, it hovers
around 2700 processes ...

