I'm having trouble with zfs causing a system to run out of memory when I think it should work OK. I have tried to err on the side of TMI.
I have a semi-old computer (2010):

  netbsd-10 amd64, 8 GB RAM, 1 TB SSD
  cpu0: "Pentium(R) Dual-Core CPU E5700 @ 3.00GHz"
  cpu1: "Pentium(R) Dual-Core CPU E5700 @ 3.00GHz"

and it basically works fine, besides being a bit slow by today's standards. I am using it as a build and file server, with the eventual goal of running pbulk, either in domUs or in chroots. I have recently moved two physical machines (netbsd-9 i386 and amd64) to domUs; I use these to build packages for production use. (The machines are 2006 and 2008 Mac notebooks, with painfully slow spinning disks and 4 GB of RAM each -- but they work.)

wd0 has a disklabel, with / and /usr as normal FFSv2 (a and e) and normal swap on wd0b. wd0f is defined as most of the disk, and is the sole component of tank0:

  #> zpool status
    pool: tank0
   state: ONLINE
    scan: scrub repaired 0 in 0h8m with 0 errors on Tue Jul  4 20:31:03 2023
  config:

          NAME                   STATE     READ WRITE CKSUM
          tank0                  ONLINE       0     0     0
            /etc/zfs/tank0/wd0f  ONLINE       0     0     0

  errors: No known data errors

I have a bunch of filesystems, for various pkgsrc branches (created from snapshots, roughly as sketched below), etc.:

  NAME                   USED  AVAIL  REFER  MOUNTPOINT
  tank0                  138G   699G    26K  /tank0
  tank0/b0              6.16G   699G  6.16G  /tank0/b0
  tank0/ccache          24.1G   699G  24.1G  /tank0/ccache
  tank0/distfiles       35.1G   699G  35.1G  /tank0/distfiles
  tank0/n0              31.5K   699G  31.5K  /tank0/n0
  tank0/obj             3.48G   699G  3.48G  /tank0/obj
  tank0/packages        7.27G   699G  7.27G  /tank0/packages
  tank0/pkgsrc-2022Q1    130M   699G   567M  /tank0/pkgsrc-2022Q1
  tank0/pkgsrc-2022Q2    145M   699G   569M  /tank0/pkgsrc-2022Q2
  tank0/pkgsrc-2022Q3    194M   699G   566M  /tank0/pkgsrc-2022Q3
  tank0/pkgsrc-2022Q4    130M   699G   573M  /tank0/pkgsrc-2022Q4
  tank0/pkgsrc-2023Q1    147M   699G   582M  /tank0/pkgsrc-2023Q1
  tank0/pkgsrc-2023Q2    148M   699G   583M  /tank0/pkgsrc-2023Q2
  tank0/pkgsrc-current  10.3G   699G  1.14G  /tank0/pkgsrc-current
  tank0/pkgsrc-wip       623M   699G   623M  /tank0/pkgsrc-wip
  tank0/u0              1.91M   699G  1.91M  /tank0/u0
  tank0/vm              49.5G   699G    23K  /tank0/vm
  tank0/vm/n9-amd64     33.0G   722G  10.1G  -
  tank0/vm/n9-i386      16.5G   711G  4.38G  -
  tank0/ztmp             121M   699G   121M  /tank0/ztmp

which all feels normal to me. I used to usually boot this as GENERIC.
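For reference, the per-branch pkgsrc filesystems were created roughly like this (the snapshot name and the cvs step here are illustrative, not the exact commands I ran; the dataset names are from the list above):

  # snapshot an existing checkout, clone it into a new filesystem,
  # then switch the clone's working copy to the new branch
  zfs snapshot tank0/pkgsrc-2023Q1@base
  zfs clone tank0/pkgsrc-2023Q1@base tank0/pkgsrc-2023Q2
  cd /tank0/pkgsrc-2023Q2 && cvs update -dP -r pkgsrc-2023Q2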
Now I'm booting xen with 4G for the dom0:

  menu=GENERIC:rndseed /var/db/entropy-file;boot netbsd
  menu=GENERIC single user:rndseed /var/db/entropy-file;boot netbsd -s
  menu=Xen:load /netbsd-XEN3_DOM0.gz root=wd0a rndseed=/var/db/entropy-file console=pc;multiboot /xen.gz dom0_mem=4096M
  menu=Xen single user:load /netbsd-XEN3_DOM0.gz root=wd0a rndseed=/var/db/entropy-file console=pc -s;multiboot /xen.gz dom0_mem=4096M
  menu=GENERIC.ok:rndseed /var/db/entropy-file;boot netbsd
  menu=Drop to boot prompt:prompt
  default=3
  timeout=5
  clear=1

I find that after doing things like cvs update in pkgsrc, I have a vast amount of memory in pools:

  Memory: 629M Act, 341M Inact, 16M Wired, 43M Exec, 739M File, 66M Free
  Swap: 16G Total, 16G Free / Pools: 3372M Used

vmstat -m, sorted by Npage and showing only pools with Npage > 1e4:

  Memory resource pool statistics
  Name             Size Requests Fail Releases  Pgreq Pgrel  Npage  Hiwat Minpg Maxpg  Idle
  zio_buf_16384   16384    57643    1    53341  33786 22341  11445  30831     0   inf  7143
  zio_buf_2560     2560    18636    0    17890  15244  2467  12777  12777     0   inf 12031
  ffsdino2          264   540607    0   348374  28691 15875  12816  13522     0   inf     0
  zfs_znode_cache   248   245152    0   206469  13015    18  12997  13015     0   inf   665
  ffsino            280   540249    0   348016  30887 17156  13731  14488     0   inf     0
  zio_buf_2048     2048    36944    0    36004  15617   599  15018  15026     0   inf 14259
  zio_buf_1536     2048    41491    0    40737  18313     6  18307  18313     0   inf 17657
  zio_buf_1024     1536    55808    0    54191  22942   357  22585  22942     0   inf 21442
  dmu_buf_impl_t    216   538828    0   440673  23016    11  23005  23016     0   inf   380
  arc_buf_hdr_t_f   208   657474    0   556468  25273   638  24635  25096     0   inf  7913
  zio_data_buf_51  1024   187177    0   157005  45575 14127  31448  45575     0   inf 10220
  vcachepl          640   266639    0    56918  34959     2  34957  34958     0   inf     1
  dnode_t           640   576198    0   485522  70645  9470  61175  70645     0   inf 11511
  zio_buf_512      1024   848240    0   798838 141743 15535 126208 128224     0   inf 96759

and sysctl reports:

  kstat.zfs.misc.arcstats.size = 283598992

If I continue to do things, the system locks up and needs to have the reset button pushed. I'm now trying an external tickle watchdog with a script that does sync/tickle/sleep-60 (a rough sketch is in the P.S. below), in the hope that it will reboot the machine once sync starts hanging. My memory is that this happens with non-xen too, but it takes longer. Other than the lockups, zfs behaves as I expect it to.

So what I don't understand is: Is there any mechanism to cause zfs (I'm guessing ARC) to limit the amount of memory in use? Is there any mechanism to cause zfs to free ARC under memory pressure? Do people think this is a xen/zfs interaction bug that doesn't happen without xen? Basically, especially with an SSD, ARC is not such a win, and ARC causing the machine to run out of memory is dysfunctional.

Questions:

  - Have I misconfigured or misused zfs?
  - Is there really no reclaiming under pressure?
  - Is there some way to limit ARC to, say, 1 GB?
  - Why isn't x% of memory a default limit, if there's no functioning reclaim under memory pressure?
  - Are others having this problem?

Greg
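P.S. For concreteness, the watchdog script is roughly the sketch below. The wdogctl(8) call is just how I'd expect to tickle an external-mode timer; substitute whatever your watchdog hardware actually needs.

  #!/bin/sh
  # Tickle the external watchdog only after sync returns; if sync hangs
  # because zfs has wedged, the tickles stop and the watchdog resets the box.
  while true; do
      sync
      wdogctl -t      # refresh the external-mode timer (assumed invocation)
      sleep 60
  done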