On Tuesday 02 February 2010 22:58:13 Max Kanat-Alexander wrote:
> All of my processes kept exiting with a report that they had a 300M
> unshared size, which was clearly untrue, even from looking at top. After
> some investigation, I discovered that Apache2::SizeLimit was calling
> $s->size on the Linux::Smaps object, when instead it should be returning
> $s->rss as the process size.
> 
Well, I tend to disagree. (Fred, Adam please read on.)

The /proc/PID/statm based check returns fields 0 and 2. According to the
following table from KERNEL/Documentation/filesystems/proc.txt, field 0 is SIZE,
not RSS.

Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
..............................................................................
 Field    Content
 size     total program size (pages)            (same as VmSize in status)
 resident size of memory portions (pages)       (same as VmRSS in status)
 shared   number of pages that are shared       (i.e. backed by a file)
 trs      number of pages that are 'code'       (not including libs; broken,
                                                        includes data segment)
 lrs      number of pages of library            (always 0 on 2.6)
 drs      number of pages of data/stack         (including libs; broken,
                                                        includes library text)
 dt       number of dirty pages                 (always 0 on 2.6)

This is also consistent with the following (Linux::Smaps prints kB while statm
shows pages, hence the division by 4, assuming 4 kB pages):

$ perl -MLinux::Smaps -le 'print Linux::Smaps->new(shift)->size/4' $$
3526
$ cat /proc/$$/statm
3525 881 396 144 0 495 0
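For comparison in the other direction, here is a minimal sketch that converts a
statm line to kB. The helper name statm_to_kb and the 4 kB default page size
are assumptions made for illustration; they are not part of Apache2::SizeLimit
or Linux::Smaps.

```perl
use strict;
use warnings;

# Field names follow Table 1-3 of the kernel's proc.txt.
my @STATM_FIELDS = qw(size resident shared trs lrs drs dt);

# Convert one line of /proc/PID/statm (values in pages) to a hash ref of
# kB values. $page_kb is an assumption: 4 unless the caller says otherwise.
sub statm_to_kb {
    my ($line, $page_kb) = @_;
    $page_kb = 4 unless defined $page_kb;

    my @pages = split ' ', $line;
    my %kb;
    @kb{@STATM_FIELDS} = map { $_ * $page_kb } @pages[0 .. $#STATM_FIELDS];
    return \%kb;
}

# The sample line from the shell session above.
my $kb = statm_to_kb('3525 881 396 144 0 495 0');
printf "size=%d kB resident=%d kB shared=%d kB\n",
    @{$kb}{qw(size resident shared)};
```

Field 0 ends up under "size" and field 1 under "resident", so the two are
easy to tell apart when comparing against Linux::Smaps.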

So, either the far older statm technique is also wrong or the patch is wrong.

But on Solaris we do "-s /proc/self/as". That is the size of the address space 
of the process. Are we wrong there, as well?

Ok, on BSD it seems to be RSS:

# rss is in KB but ixrss is in BYTES.
# This is true on at least FreeBSD, OpenBSD, & NetBSD
sub _bsd_size_check {

    my @results = BSD::Resource::getrusage();
    my $max_rss   = $results[2];
    my $max_ixrss = int ( $results[3] / 1024 );

    return ($max_rss, $max_ixrss);
}

I cannot say anything about the Windows code. What does
"$peak_working_set_size" mean? To me the wording sounds a bit like
max(resident set size).

So, we have at least 2 different meanings of the SIZE result. On BSD it is RSS;
on Solaris and Linux it is SIZE. Which is correct?

Let's see how it is used:

    my ($size, $share, $unshared) = $class->_check_size();

    return 1 if $MAX_PROCESS_SIZE  && $size > $MAX_PROCESS_SIZE;

    return 0 unless $share;

    return 1 if $MIN_SHARE_SIZE    && $share < $MIN_SHARE_SIZE;

    return 1 if $MAX_UNSHARED_SIZE && $unshared > $MAX_UNSHARED_SIZE;

It is compared with $MAX_PROCESS_SIZE and there is no $MAX_PROCESS_RSS.

And what do the docs say?

=item * Apache2::SizeLimit->set_max_process_size($size)

This sets the maximum size of the process, including both shared and
unshared memory.

Again, it talks about process size, not RSS.

Now let's assume we checked RSS instead of SIZE.

When a new Apache worker process is created, its growth in SIZE depends on how
much additional memory it allocates over time. Its RSS, however, depends on the
process size and on which part of it is swapped out. It seems to me that we want
to kill a worker if its size grows beyond the initial worker size plus a
certain amount it is allowed to grow. If we checked RSS, then a worker or
even all workers could be killed because some administrator does

  swapoff /dev/...

and suddenly all pages that were swapped to this device are copied into RAM 
and added to the worker's RSS. I think that would be wrong.


But you mentioned "unshared size". How is it calculated?
Apache::SizeLimit::Core::_check_size() simply computes $size - $share. On Linux
this is wrong. When you read /proc/PID/smaps, the kernel walks through all pages
that belong to the process and does:

  size+=pagesize;
  if( page_is_in_RAM ) {
    rss+=pagesize;
    if( reference_count>1 ) {  /* how many processes map that page */
      shared+=pagesize;
    } else {
      private+=pagesize;
    }
  }

When a page is not in RAM there is unfortunately no way to check the reference
count other than swapping it in. And that is certainly too high a cost.

Currently Apache2::SizeLimit assumes that everything that is not in core is 
not shared. This is certainly wrong.
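To make the difference concrete, here is a small simulation of the accounting
described above. The page model (hashes with in_ram and refcount keys) and the
name account_pages are hypothetical and exist only for illustration; the real
accounting happens inside the kernel.

```perl
use strict;
use warnings;

# Hypothetical page model: each page is a hash ref with
#   in_ram   => 1 if the page is resident, 0 otherwise
#   refcount => number of processes mapping the page
# The refcount is only consulted for resident pages, as in the
# pseudocode above.
sub account_pages {
    my ($pages, $page_kb) = @_;
    my %sum = (size => 0, rss => 0, shared => 0, private => 0);

    for my $page (@$pages) {
        $sum{size} += $page_kb;
        next unless $page->{in_ram};
        $sum{rss} += $page_kb;
        if ($page->{refcount} > 1) {
            $sum{shared} += $page_kb;
        }
        else {
            $sum{private} += $page_kb;
        }
    }
    return \%sum;
}

# 10 pages: 4 resident and shared, 2 resident and private, 4 not resident.
my @pages = (
    (map { +{ in_ram => 1, refcount => 3 } } 1 .. 4),
    (map { +{ in_ram => 1, refcount => 1 } } 1 .. 2),
    (map { +{ in_ram => 0, refcount => 3 } } 1 .. 4),
);

my $s = account_pages(\@pages, 4);
printf "size - shared = %d kB, private = %d kB\n",
    $s->{size} - $s->{shared}, $s->{private};
```

In this made-up example the four non-resident pages are mapped by other
processes too, yet $size - $share counts all of them as unshared, while the
private sum covers only the resident pages that are known to be unshared.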

Perhaps Apache::SizeLimit::Core should read:

sub _check_size {
    my $class = shift;

    my ($size, $share, $unshared) = $class->_platform_check_size();

    return ($size, $share, defined $unshared ? $unshared : $size - $share);
}

sub _linux_smaps_size_check {
    my $class = shift;

    return $class->_linux_size_check() unless $USE_SMAPS;

    my $s = Linux::Smaps->new($$)->all;
    return ($s->size,
            $s->shared_clean + $s->shared_dirty,
            $s->private_clean + $s->private_dirty);
}

I think that would be closer to what you want.
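The fallback in the proposed _check_size() can be tried in isolation. This is
only a sketch: with_unshared is a hypothetical stand-in for the method, and
the numbers are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed _check_size(): keep the
# platform's own unshared value when it is defined, otherwise fall
# back to size - share.
sub with_unshared {
    my ($size, $share, $unshared) = @_;
    return ($size, $share, defined $unshared ? $unshared : $size - $share);
}

# A platform check that returns only size and share (statm-like).
my @statm_like = with_unshared(14100, 1584);

# A platform check that also returns private pages (smaps-like).
# All three numbers here are invented.
my @smaps_like = with_unshared(14104, 1584, 1900);

print "statm-like unshared: $statm_like[2] kB\n";
print "smaps-like unshared: $smaps_like[2] kB\n";
```

Testing with defined rather than for truth means an explicit unshared value
of 0 is kept instead of being replaced by $size - $share.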

Torsten
