macppc system wedging under memory pressure

2022-09-08 Thread Havard Eidnes
Hi,

I'm running NetBSD-current on one of my 1G Mac Mini G4 systems,
doing pkgsrc bulk building.

This go-around I've managed to build llvm, and next up is rust.  This
is proving to be difficult -- my system will consistently wedge its
userland (it still responds to ping, but there is no response on the
console or in any ongoing ssh session; well, not entirely: it will
echo one carriage return on the console as a newline, but then that
wedges as well).  Also, I have still not managed to break into DDB on
this system, so every time I have to power-cycle the box.  This also
means that all I have to go on is the output of "top -s 1", "vmstat 1"
and "systat vm", and this is the latest information I got from these
programs when it wedged just now:

load averages:  1.10,  1.13,  1.05;   up 0+02:01:45        21:59:52
103 threads: 5 idle, 6 runnable, 90 sleeping, 1 zombie, 1 on CPU
CPU states:  1.0% user,  5.9% nice, 93.1% system,  0.0% interrupt,  0.0% idle
Memory: 559M Act, 274M Inact, 12M Wired, 186M Exec, 162M File, 36K Free
Swap: 3026M Total, 80M Used, 2951M Free / Pools: 134M Used

  PID   LID USERNAME PRI STATE     TIME   WCPU    CPU NAME      COMMAND
 6376 26281 1138      78 RUN       2:03 89.10% 88.96% rustc     rustc
    0   109 root     126 pgdaemon  0:20 15.48% 15.48% pgdaemon  [system]
  733   733 he        85 poll      0:14  2.93%  2.93% -         sshd
  164   164 he        85 RUN       0:06  1.17%  1.17% -         systat

Notice the rather small amount of "Free" memory, and the rather
high rate of system CPU.  The "vmstat 1" output for the last few
seconds:

 procs    memory        page                          disk   faults     cpu
 r b    avm    fre   flt  re  pi   po   fr   sr  w0   in   sy   cs  us sy id
 1 0   634804   4164 1869   0   00 1358 1358  0  2800 425 97  3  0
 3 0   637876   1016  786   0   0000  0  2130 410 99  1  0
 2 0   636336   2512  816   4   00 1192 1202  0  3260 508 98  2  0
 2 0   633448   5456  617   0   00 1355 1371  0  2280 374 99  1  0
 2 0   634964   3780  430   0   0000  0  2500 452 98  2  0
 2 0   635988   2740  260   0   0000  0  2610 496 98  2  0
 2 0   637396   1376  386   0   0000  0  3000 459 97  3  0
 2 0   634912   4060  775   0   00 1354 1354  0  1900 245 100 0 0
 2 0   636940   2308  437   0   0000  0  2500 415 100 0 0
 2 0   637912   1064  473   0   0000  0  2510 406 100 0 0
 2 0   633580   5408  175   0   00 1262 1270  0  2540 403 99  1  0
 2 0   637288   1740 1002   0   0000  0  2780 521 97  3  0
 2 0   634340   4324  713   0   00 1354 1357  0  2960 471 96  4  0
 2 0   636388   2160  540   0   0000  0  2160 361 98  2  0
 2 0   637412   1116  258   0   0000  0  2540 405 98  2  0
 2 0   637556   4872  178  12   0  996 1122 42861  4  3070 442 30 70  0
 2 0   638064   9620 1105   3   0 1228 1228 2305 70  4110 667 19 81  0
 2 0   639624   7416  550   0   0000  0  3190 584 97  3  0
 2 0   644744   2200 1299   0   0000  0  2790 416 93  7  0
 6 0   646924   2716  537   0   0 1356  672 2403 14  4120 497 35 65  0
 4 0   654792 36 2022  32   0 1354 1366 7910 91  2410 6735 7 93  0

while "systat vm" doesn't really give any more information than
the above:

 6 users    Load  1.10  1.13  1.05                  Thu Sep  8 21:59:51

Proc:r  d  s   Csw  Traps SysCal  Intr   Soft  Fault     PAGING   SWAPPING
     8       3355471    302  75398                     in  out   in  out
                                                 ops         64
  68.2% Sy   0.0% Us  31.8% Ni   0.0% In   0.0% Id     pages      1027
|    |    |    |    |    |    |    |    |    |    |
==                                                     forks
                                                       fkppw
Anon   509096  50%   zero      472   Interrupts        fksvm
Exec   190804  18%   wired   12000   100 cpu0 clock    pwait
File   166072  16%   inact  280984       openpic irq 29    relck
Meta    82832   2%   bufs     6500       openpic irq 63    rlkok
 (kB)  real   swaponly   free     38    openpic irq 39   1 noram
Active 570368  73812      2716          openpic irq 40  11 ndcpy
       Namei  Sys-cache  Proc-cache 167 openpic irq 41     fltcp
       Calls  hits    %  hits    %  167 gem0 interrupts 397 zfod
          66   100                                         cow
                                                       256 fmin
 Disks:  cd0  wd0                                      341 ftarg
 seeks                                                     itarg
 xfers        14

Re: macppc system wedging under memory pressure

2022-09-09 Thread Havard Eidnes
Well,

following up on my own posting of yesterday evening.

There's good and not so good news: the good news is that my G4
Mac Mini running -current finally managed to build rust-1.62.1
from pkgsrc-current (using llvm from pkgsrc, not the internal
one).  The bad news is that I don't have a definitive explanation
of what caused my earlier problems, even though I'm pretty sure
it was VM-related.

Based at least partially on suggestions from fellow NetBSD
developers, I've made the following adjustments to this host's
setup:

Reduced kern.maxvnodes from the default of around 55000 to 1
(a lesson from an earlier, similar experience on the i386 port).
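
Reading and setting that limit is a plain sysctl(8) operation; a
sketch, with N standing in for whichever cap one picks:

   sysctl kern.maxvnodes        # show the current limit
   sysctl -w kern.maxvnodes=N   # set the new, lower cap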

I had earlier added 1GB of swap space as a file, which I have now
removed from use as swap with swapctl; each configured swap area
ties up a bit of physical memory for bookkeeping.  (I already had
a 2GB swap partition, which turns out to be sufficient.)
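
For reference, the removal goes roughly like this (the file name
here is only illustrative):

   swapctl -l                  # list configured swap devices and usage
   swapctl -d /swap/swapfile   # stop swapping to the file
   rm /swap/swapfile           # reclaim the disk space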

I made the following adjustments to vm settings:

vm.filemax=20  (down from 50)
vm.filemin=5   (down from 10)
vm.execmin=5   (hm, already at 5?)
vm.anonmax=50  (down from 80)
vm.anonmin=5   (down from 10)

Apparently the *min values are what made the difference; I believe
I made the *max adjustments earlier without success.  See the info
in the sysctl(7) man page.
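
For anyone wanting to experiment with the same knobs: they can be
set on a running system with sysctl(8), and made permanent by
listing the same name=value pairs in /etc/sysctl.conf, e.g.:

   sysctl -w vm.filemax=20
   sysctl -w vm.filemin=5
   sysctl -w vm.anonmax=50
   sysctl -w vm.anonmin=5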

Regards,

- Håvard


Re: macppc system wedging under memory pressure

2022-09-15 Thread Lloyd Parkes
You aren't the first person to have problems with memory pressure. We 
really are going to have to get around to documenting the memory 
management algorithms and all the tuning knobs.


I used to use this page (https://imil.net/NetBSD/mirror/vm_tune.html), 
but I have no idea how current it is. Also, I haven't used my smaller 
systems for a while now.


In the past, I used to set vm.filemax to 5, because I never want a
page that can simply be reread from its file to force an anonymous
page to be written out to swap.
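
To see what a system is currently running with, all six knobs can
be read back in one go (they are percentages of physical memory;
sysctl(7) has the semantics):

   sysctl vm.anonmin vm.anonmax vm.execmin vm.execmax vm.filemin vm.filemax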


Cheers,
Lloyd

On 9/09/22 08:30, Havard Eidnes wrote:

> Hi,
>
> I'm running NetBSD-current on one of my 1G Mac Mini G4 systems,
> doing pkgsrc bulk building.
>
> [...]

Re: macppc system wedging under memory pressure

2022-09-16 Thread Mike Pumford
On 16/09/2022 06:14, Lloyd Parkes wrote:
> You aren't the first person to have problems with memory pressure. We
> really are going to have to get around to documenting the memory
> management algorithms and all the tuning knobs.
>
> [...]
>
> In the past, I used to set vm.filemax to 5, because I never want a
> page that can simply be reread from its file to force an anonymous
> page to be written out to swap.


I've been running my build system (an 8-core amd64 system with 16GB of
RAM) with:


vm.filemax=10
vm.filemin=1

So it's not just small systems that need better tuning.

Before I set those, I found that the system would prioritise the file
cache so much that any large, long-running process would be swapped
out so heavily that it then took ages to recover. In my case that was
the jenkins process managing the build, leading to lots of failed
builds as jenkins fell apart. Setting those limits meant the file
cache got evicted instead of the jenkins process.


I also found that the same settings kept things like firefox from
getting swapped out during builds.


This is all on 9.3 stable, and all other vm.* settings are at their
defaults.


Mike


Re: macppc system wedging under memory pressure

2022-09-16 Thread Michael
Hello,

On Fri, 16 Sep 2022 19:41:44 +0100
Mike Pumford  wrote:

> I've been running my build system (an 8-core amd64 system with 16GB of
> RAM) with:
> 
> vm.filemax=10
> vm.filemin=1
> 
> [...]
> 
> I also found that the same settings kept things like firefox from
> getting swapped out during builds.

I've seen the same thing on a sparc64 with 12GB RAM - firefox and claws
would get swapped out while the buffer cache would stay at 8GB or more,
with a couple cc1plus instances fighting over the remaining RAM.

have fun
Michael


Re: macppc system wedging under memory pressure

2022-09-16 Thread Paul Ripke
On Fri, Sep 16, 2022 at 08:02:07PM -0400, Michael wrote:
> Hello,
> 
> On Fri, 16 Sep 2022 19:41:44 +0100
> Mike Pumford  wrote:
> 
> > [...]
> > 
> > I also found that the same settings kept things like firefox from
> > getting swapped out during builds.
> 
> I've seen the same thing on a sparc64 with 12GB RAM - firefox and claws
> would get swapped out while the buffer cache would stay at 8GB or more,
> with a couple cc1plus instances fighting over the remaining RAM.

Yup; my amd64 16GiB general-purpose system tends to run some RAM-heavy
apps (builds, java, firefox, blender, prusaslicer, ...), so I've had
these tweaks in place for many years:

vm.anonmin=50
vm.filemin=5
vm.filemax=15

Cheers,
-- 
Paul Ripke
"Great minds discuss ideas, average minds discuss events, small minds
 discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.