Re: raidframeparity and /etc/defaults/rc.conf

2022-07-03 Thread Michael van Elst
k...@munnari.oz.au (Robert Elz) writes:

>Does someone know of a reason for a setting for the rc.conf
>(rc.d/*) variable raidframeparity to be omitted from /etc/defaults/rc.conf?

>To me that looks like an oversight.

raidframeparity has no rcvar switch; it is always started, so there is no
variable to set a default for in /etc/defaults/rc.conf.
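
For anyone unfamiliar with the mechanism: only scripts that declare an rcvar
get an on/off knob that needs a default in /etc/defaults/rc.conf. A minimal,
hypothetical sketch (not the actual raidframeparity script):

```
#!/bin/sh
#
# PROVIDE: example_service
#
# Hypothetical rc.d script for illustration only.

. /etc/rc.subr

name="example_service"
rcvar=$name          # omit this line and run_rc_command never consults
                     # ${name}=YES/NO from rc.conf, so the service is
                     # always started -- as with raidframeparity
start_cmd="echo Starting ${name}."
stop_cmd=":"

load_rc_config $name
run_rc_command "$1"
```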



Re: raidframeparity and /etc/defaults/rc.conf

2022-07-03 Thread Robert Elz
Ah, OK, thanks, somehow I missed the absence of the rcvar= line.

kre



Re: pgdaemon high CPU consumption

2022-07-03 Thread Matthias Petermann

Hello,

On 01.07.22 12:48, Brad Spencer wrote:

"J. Hannken-Illjes"  writes:


On 1. Jul 2022, at 07:55, Matthias Petermann  wrote:

Good day,

For some time now I have noticed that on several of my systems with NetBSD/amd64
9.99.97/98, after longer usage, the kernel process pgdaemon completely claims a
CPU core for itself, i.e. it constantly consumes 100%.
The affected systems do not have a shortage of RAM, and the problem does not
disappear even when all workloads are stopped and thus no RAM is actually being
used by application processes.

I noticed this especially in connection with accesses to the ZFS set up on the
respective machines - for example after a checkout from the local CVS
repository hosted on ZFS.

Is this already a known problem, or what information would need to be collected
to get to the bottom of it?

I currently have such a case online, so I would be happy to pull diagnostic 
information this evening/afternoon. At the moment all info I have is from top.

Normal view:

```
  PID USERNAME PRI NICE   SIZE   RES STATE    TIME   WCPU    CPU COMMAND
    0 root     126    0     0K   34M CPU/0  102:45   100%   100% [system]
```

Thread view:


```
  PID   LID USERNAME PRI STATE    TIME   WCPU    CPU NAME      COMMAND
    0   173 root     126 CPU/1   96:57 98.93% 98.93% pgdaemon  [system]
```


Looks a lot like kern/55707: ZFS seems to trigger a lot of xcalls

Last action proposed was to back out the patch ...

--
J. Hannken-Illjes - hann...@mailbox.org



Probably only a slightly related data point, but yes, if you have a
system / VM / Xen PV that does not have a whole lot of RAM and you
don't back out that patch, your system will become unusable in very
short order if you do much at all with ZFS (tested with a recent
-current building pkgsrc packages on a Xen PVHVM).  The patch does fix a
real bug, as NetBSD doesn't have the define that it uses, but the effect
of running that code is needed if you use ZFS at all on a "low" RAM
system.  I personally suspect that the ZFS ARC or some pool is allowed
to consume nearly all available "something" (pools, RAM, etc.) without
limit, but I have no specific proof (or there is a leak somewhere).  I
mostly run 9.x ZFS right now (which may have other problems), and have
been setting maxvnodes way down for some time.  If I don't do that, the
Xen PV will hang itself up after a couple of 'build.sh release' runs
when the source and build artifacts are on ZFS filesets.
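
For reference, this is roughly how that maxvnodes workaround is applied on
NetBSD; the value is only an illustrative guess, the posts do not say which
number is used:

```
# show the current vnode limit
sysctl kern.maxvnodes

# lower it on the running system (example value, not a recommendation)
sysctl -w kern.maxvnodes=100000

# make it persistent across reboots
echo kern.maxvnodes=100000 >> /etc/sysctl.conf
```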


Thanks for describing this use case. Apart from the fact that I don't
currently use Xen on the affected machine, it performs a similar
workload. I use it as a pbulk builder, with distfiles, build artifacts and
a CVS / Git mirror stored on ZFS. The builders themselves are located in
chroot sandboxes on FFS. Anyway, I can trigger the observations by doing
a NetBSD src checkout from the ZFS-backed CVS to the FFS partition.
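
For concreteness, a sketch of that reproduction step; the repository path and
working directory are made-up placeholders, not taken from the post:

```
# local CVS repository on a ZFS dataset, working copy on an FFS partition
cd /ffs/scratch
cvs -d /tank/cvsroot checkout -P src
```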


The maxvnodes trick at first led to pgdaemon behaving normally again, but the
system froze shortly afterwards with no further evidence.


I am not sure if this thread is the right one for pointing this out, but
I experienced further issues with NetBSD-current and ZFS when I tried to
perform a recursive "zfs send" of a particular snapshot of my data sets.
Although it initially works, I see the system freeze after a couple of
seconds with no chance to recover (I could not even enter the kernel
debugger). I will come back once I have prepared a dedicated test VM for
my cases.
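
A sketch of the kind of recursive send described above; pool, dataset and
snapshot names are placeholders, not taken from the post:

```
# take a recursive snapshot of a dataset tree (hypothetical names)
zfs snapshot -r tank/data@2022-07-03

# send the whole tree to a file ...
zfs send -R tank/data@2022-07-03 > /var/tmp/data.zsend

# ... or pipe it straight into another pool
zfs send -R tank/data@2022-07-03 | zfs receive -F backup/data
```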


Kind regards
Matthias





Re: pgdaemon high CPU consumption

2022-07-03 Thread Brad Spencer
Matthias Petermann  writes:

> Hello,
>
> On 01.07.22 12:48, Brad Spencer wrote:
>> "J. Hannken-Illjes"  writes:
>> 
 On 1. Jul 2022, at 07:55, Matthias Petermann  wrote:

 Good day,

 For some time now I have noticed that on several of my systems with NetBSD/amd64
 9.99.97/98, after longer usage, the kernel process pgdaemon completely
 claims a CPU core for itself, i.e. it constantly consumes 100%.
 The affected systems do not have a shortage of RAM, and the problem does
 not disappear even when all workloads are stopped and thus no RAM is
 actually being used by application processes.

 I noticed this especially in connection with accesses to the ZFS set up on
 the respective machines - for example after a checkout from the local CVS
 repository hosted on ZFS.

 Is this already a known problem, or what information would need to be
 collected to get to the bottom of it?

 I currently have such a case online, so I would be happy to pull 
 diagnostic information this evening/afternoon. At the moment all info I 
 have is from top.

 Normal view:

 ```
   PID USERNAME PRI NICE   SIZE   RES STATE    TIME   WCPU    CPU COMMAND
     0 root     126    0     0K   34M CPU/0  102:45   100%   100% [system]
 ```

 Thread view:


 ```
   PID   LID USERNAME PRI STATE    TIME   WCPU    CPU NAME      COMMAND
     0   173 root     126 CPU/1   96:57 98.93% 98.93% pgdaemon  [system]
 ```
>>>
>>> Looks a lot like kern/55707: ZFS seems to trigger a lot of xcalls
>>>
>>> Last action proposed was to back out the patch ...
>>>
>>> --
>>> J. Hannken-Illjes - hann...@mailbox.org
>> 
>> 
>> Probably only a slightly related data point, but yes, if you have a
>> system / VM / Xen PV that does not have a whole lot of RAM and you
>> don't back out that patch, your system will become unusable in very
>> short order if you do much at all with ZFS (tested with a recent
>> -current building pkgsrc packages on a Xen PVHVM).  The patch does fix a
>> real bug, as NetBSD doesn't have the define that it uses, but the effect
>> of running that code is needed if you use ZFS at all on a "low" RAM
>> system.  I personally suspect that the ZFS ARC or some pool is allowed
>> to consume nearly all available "something" (pools, RAM, etc.) without
>> limit, but I have no specific proof (or there is a leak somewhere).  I
>> mostly run 9.x ZFS right now (which may have other problems), and have
>> been setting maxvnodes way down for some time.  If I don't do that, the
>> Xen PV will hang itself up after a couple of 'build.sh release' runs
>> when the source and build artifacts are on ZFS filesets.
>
> Thanks for describing this use case. Apart from the fact that I don't
> currently use Xen on the affected machine, it performs a similar
> workload. I use it as a pbulk builder, with distfiles, build artifacts and
> a CVS / Git mirror stored on ZFS. The builders themselves are located in
> chroot sandboxes on FFS. Anyway, I can trigger the observations by doing
> a NetBSD src checkout from the ZFS-backed CVS to the FFS partition.
>
> The maxvnodes trick at first led to pgdaemon behaving normally again, but the
> system froze shortly afterwards with no further evidence.
>
> I am not sure if this thread is the right one for pointing this out, but
> I experienced further issues with NetBSD-current and ZFS when I tried to
> perform a recursive "zfs send" of a particular snapshot of my data sets.
> Although it initially works, I see the system freeze after a couple of
> seconds with no chance to recover (I could not even enter the kernel
> debugger). I will come back once I have prepared a dedicated test VM for
> my cases.
>
> Kind regards
> Matthias


I saw something like that with a "zfs send..." and "zfs receive..."
locking up, just one time.  I do that sort of thing fairly often to move
filesets between one system and another and it has worked fine for me,
except in one case...  the destination was a NetBSD-current system with a
ZFS fileset set to use compression.  The source is a FreeBSD system with a
ZFS fileset created in such a manner that NetBSD is happy with it, and it
is also set to use compression.  No amount of messing around would let 'zfs
send  | ssh destination "zfs receive "' complete without
locking up the destination.  When I changed the destination to not use
compression, I was able to perform the zfs send / receive pipeline
without any problems.  The destination is a pretty recent -current Xen
PVHVM guest and the source is FreeBSD 12.1 (running minio to back up
my Elasticsearch cluster).
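
To make the setup easier to picture, a hedged sketch of the pipeline and the
compression property involved; all dataset, host and snapshot names are
invented for illustration:

```
# check whether the destination fileset uses compression (placeholder names)
zfs get compression backup/minio

# the kind of pipeline that locked up the compressed destination
zfs send tank/minio@snap1 | ssh dest-host "zfs receive -F backup/minio"

# the workaround described above: disable compression on the destination
# (or recreate it without compression) before receiving
zfs set compression=off backup/minio
```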



-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org



daily CVS update output

2022-07-03 Thread NetBSD source update


Updating src tree:
P src/bin/ksh/expr.c
P src/external/historical/nawk/bin/awk.1
P src/external/mit/xorg/lib/libXaw/Makefile.common
P src/sys/arch/atari/atari/atari_init.c
P src/sys/arch/atari/atari/bus.c
P src/sys/arch/atari/atari/machdep.c
P src/sys/arch/atari/include/bus_funcs.h
P src/sys/arch/atari/include/video.h
P src/sys/arch/atari/stand/installboot/Makefile
P src/sys/arch/atari/stand/installboot/installboot.c
P src/sys/arch/atari/vme/et4000.c
P src/sys/arch/atari/vme/leo.c
P src/sys/arch/evbarm/armadillo/armadillo9_machdep.c
P src/sys/arch/evbarm/g42xxeb/g42xxeb_machdep.c
P src/sys/arch/evbarm/tsarm/tsarm_machdep.c
P src/sys/arch/hp300/dev/sti_sgc.c
P src/sys/arch/hpcmips/dev/ite8181reg.h
P src/sys/arch/hpcsh/dev/hd64461/hd64461video.c
P src/sys/arch/luna68k/dev/lunafb.c
P src/sys/arch/zaurus/zaurus/machdep.c
P src/sys/dev/ieee1394/fwohci.c
P src/sys/dev/pci/if_bge.c
P src/sys/dev/pci/tgareg.h
P src/tests/usr.bin/xlint/lint1/msg_132.c
P src/usr.bin/xlint/common/tyname.c
P src/usr.bin/xlint/lint1/README.md
P src/usr.bin/xlint/lint1/check-msgs.lua
P src/usr.bin/xlint/lint1/debug.c
P src/usr.bin/xlint/lint1/decl.c
P src/usr.bin/xlint/lint1/externs1.h
P src/usr.bin/xlint/lint1/func.c
P src/usr.bin/xlint/lint1/tree.c

Updating xsrc tree:


Killing core files:



Updating release-8 src tree (netbsd-8):

Updating release-8 xsrc tree (netbsd-8):



Updating release-9 src tree (netbsd-9):

Updating release-9 xsrc tree (netbsd-9):




Updating file list:
-rw-rw-r--  1 srcmastr  netbsd  39280962 Jul  4 03:09 ls-lRA.gz