I've raised the issue of mlock on the httpd-dev list. Scott Hess 
followed up with extensive explanations and, most importantly, C code 
to verify that memory doesn't become unshared when swapped out. The cool 
thing about it being in C is that it's easy to create big chunks of 
shared memory, so now you can easily verify the effects discussed on 
this list recently.

The conclusion is the same as always: don't let your machine 
swap, and use memory-limiting tools to maintain the desired 
memory usage cap.

-------- Original Message --------
Subject: Re: performance: using mlock(2) on httpd parent process
Date: Wed, 20 Mar 2002 11:08:51 -0800 (PST)
From: Scott Hess <[EMAIL PROTECTED]>
To: Stas Bekman <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]

On Thu, 21 Mar 2002, Stas Bekman wrote:
 > > On Wed, 20 Mar 2002, Stas Bekman wrote:
 > >
 > >>mod_perl child processes save a lot of memory when they can share
 > >>memory with the parent process and quite often we get reports from
 > >>people that they lose that shared memory when the system decides to
 > >>page out the parent's memory pages because they are LRU (least
 > >>recently used, the algorithm used by many memory managers).
 > >>
 > >
 > > I'm fairly certain that this is not an issue.  If a page was shared
 > > COW before being paged out, I expect it will be shared COW when paged
 > > back in, at least for any modern OS.
 >
 > But if the system needs to page things out, most of the parent process's
 > pages will be scheduled to go first, no? So we are talking about a
 > constant page-in/page-out from/to the parent process as a performance
 > degradation rather than memory unsharing. Am I correct?

The system is going to page out an approximation of the
least-recently-used pages.  If the children are using those pages, then
they won't be paged out, regardless of what the parent is doing.  [If the
children _aren't_ using those pages, then who cares?]

 > > [To verify that I wasn't talking through my hat, here, I just verified
 > > this using RedHat 7.2 running kernel 2.4.9-21.  If you're interested
 > > in my methodology, drop me an email.]
 >
 > I suppose that this could vary from one kernel version to another.

Perhaps, but I doubt it.  I can't really do real tests on older kernels
because I don't have them on any machines I control, but I'd be somewhat
surprised if any OS which runs on modern hardware worked this way.  It
would require the OS to map a given page to multiple places in the
swapfile, which would be significant extra work, and I can't think of any
gains from doing so.

 > I'm just repeating the reports posted to the mod_perl list. I've never
 > seen such a problem myself, since I try hard to have close to zero swap
 > usage.

:-).  In my experience, you can get some really weird stuff happening when
you start swapping mod_perl.  It seems to be stable in memory usage,
though, so long as you have MaxClients set low enough that your maximum
amount of committed memory is appropriate.  Also, I've seen people run
other heavyweight processes, like mysql, on the same system, so that when
the volume spikes, mod_perl spikes AND mysql spikes.  A sure recipe for
disaster.

 > [Yes, please let me know your methodology for testing this]

OK, two programs.  bigshare.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>

#define MEGS 256
static char *mem = NULL;
static char vv = 0;

/* Touch one byte per 4K page so every page of the block gets read
   (and paged back in if swapped out); storing into vv keeps the
   loop from being optimized away. */
static void handler(int signo)
{
     char val = 0;
     unsigned ii;
     signal(signo, handler);   /* re-arm the handler */
     for (ii=0; ii<MEGS*1024*1024; ii+=4096) {
         val += mem[ii];
     }
     vv = val;
}

int main(int argc, char **argv)
{
     mem = calloc(1, MEGS*1024*1024);
     if (mem == NULL) {
         perror("calloc");
         return 1;
     }
     /* Write to every page so the memory is actually committed before
        forking; on a lazily-allocating kernel, untouched calloc pages
        are never made resident.  After the forks, the pages are
        shared copy-on-write. */
     memset(mem, 1, MEGS*1024*1024);

     fork();                   /* two forks -> four processes total */
     fork();
     signal(SIGUSR1, handler);

     while(1) {
         sleep(1000);
     }
     return 0;
}



and makeitswap.c:

#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
     char *mem = calloc(1, 384*1024*1024);
     if (mem != NULL) {
         /* Touch every page: merely allocating isn't enough to force
            other pages out on a lazily-allocating kernel. */
         memset(mem, 1, 384*1024*1024);
     }
     free(mem);
     return 0;
}

These both compile under RedHat 7.2; you might have to adjust the #include
directives for other systems.  Adjust the MEGS value in bigshare.c to be
big enough to matter, but not so big that it causes bigshare itself to
swap.  I chose 1/2 of my real memory size.  The 384 in makeitswap.c is 3/4
of my real memory, so it pushes tons of stuff into swap.

Run bigshare.  Use ps or something appropriate to determine that, indeed,
all four bigshare processes are using up 256M of memory, but it's all
shared.

Then, run makeitswap.  All of the bigshare processes should partly or
fully page out.  Afterwards I was seeing RSS from 260k to 1M on the
bigshare processes.

Then, kill -USR1 one of the bigshare processes.  This causes the process to
re-read all of the memory it allocated earlier, so it should page in
256M or so.  ps or top should show the RSS rising as it swaps back in.
You can also use "vmstat 1" to watch it happen (watch the si column under
swap).  On some systems you may need to use iostat.  More than likely your
system response also goes to heck, because it's spending so much time
swapping data in.  bigshare should end up with an RSS of about 256M again.

Then, kill -USR1 another of the bigshare processes.  On my system, this
happened much faster than the first time.  Also, I saw only minimal
swapins in vmstat (128 or so per second, versus >10,000 per second for the
-USR1 against the first process).  Send -USR1 to other bigshare processes,
same results.  You can verify that the pages are shared with ps or
whatever.

 > >>Therefore my question is there any reason for not using mlockall(2) in
 > >>the parent process on systems that support it and when the parent
 > >>httpd is started as root (mlock* works only within root owned
 > >>processes).
 > >
 > > I don't think mlockall is appropriate for something with the heft of
 > > mod_perl.
 > >
 > > Why are the pages being swapped out in the first place?  Presumably
 > > there's a valid reason.
 >
 > Well, the system comes close to having zero real memory available. The
 > parent process starts swapping like crazy because most of its pages are
 > LRU, slowing the whole system down, and if the load doesn't go away, the
 > system grinds to a halt.

I can think of a couple possible causes.  One is the MaxClients setting.
Say you have MaxClients set to 50, but on a standard day you never need
more than 15 servers.  Then you get listed on slashdot.  As you pass, oh,
30 simultaneous servers, you start thrashing, so requests take longer to
process, so you immediately spike to your MaxClients of 50 and the server
goes right down the drain.  If you shut things down and start them back
up, it's likely you'll immediately spike to 50 again, and back down the
drain it goes.

I've found that it's _super_ important to make certain you've pushed
mod_perl to MaxClients under your standard conditions.  Once you start
swapping, you're doomed, unless the traffic was a very short spike.

Another possible cause is that you have another heavyweight server running
on the same server.  As I indicated above, I've seen people do this with
things like mysql.  Since high mod_perl traffic implies high mysql
traffic, it's just like having MaxClients set too high, but twice as bad!

Another possible cause is that the OS is aggressively grabbing pages for
the filesystem cache.  It's possible that tuning down the size of the
filesystem cache would be appropriate - many webservers have a very large
maximum amount of data they might serve, but a very small working set.

Really, though, all of this tends to come down to MaxClients.  The
database load is proportional to the number of clients, the filesystem
load is proportional to the number of clients, everything is proportional
to the number of clients.  MaxClients is as close to a silver bullet as
I've seen.  People tend to set MaxClients based on their expected load,
rather than on how much load their server can handle. You just have to
arrange for your maximum committed memory usage to appropriately reflect
the memory available, or you're doomed, there's nothing the OS can do to
help.

 > > Doing mlockall on your mod_perl would result in restricting the memory
 > > available to the rest of the system.  Whatever is causing mod_perl to
 > > page out would then start thrashing.  Worse, since mlockall will lock
 > > down mod_perl pages indiscriminately, the resulting thrashing will
 > > probably be even worse than what they're seeing right now.
 >
 > Possible, I've never tried this myself and therefore asked. Someone has
 > suggested using P_SYSTEM flag which is supposed to tell the system not
 > to page out certain pages, but seems to be a *BSD thingy.

Really, the problem is that it's very hard to figure out which pages of
mod_perl really need this treatment.  Heck, it's very hard with any
program, but with mod_perl you have to deal with the obfuscating Perl
virtual machine layer.  That's pretty tough.  If you could just lock down
whatever is needed to keep running, that would be great...

Later,
scott

-- 


_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:[EMAIL PROTECTED]  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/
