At Thu, 17 Jul 2025 18:09:15 -0700, Brian Buhrow <[email protected]> wrote:
Subject: Re: Processes getting stuck in "fstchg" with NetBSD-10.99.12/amd64
>
> Hello. Thanks for the reply. I'm still trying to work out what's
> going on. It may very well be a memory shortage issue, but I'm
> thinking it's either some kind of memory fragmentation issue or a
> network related problem. The issue appears to be triggered when ssh
> sessions are uncleanly terminated. Specifically, when dangling
> connections are left hanging by stateful firewalls which timeout
> between client and server, causing the server side to shutdown
> uncleanly. What appears to happen is that something gets hung up, a
> bunch of processes start, things get stuck in fstchg and everything
> hangs, though the kernel doesn't crash. Sometimes I see proc table
> full messages, but not always.
> The next time it happens I'll call fstrans_dump from ddb to see if
> that yields any results, but right now, I'm at a loss as to which
> process it is that gets stuck initially, causing the pileup. And,
> while I am pretty sure I know what triggers the problem, I haven't
> quite figured out how to reproduce it at will.
> Anyone seen anything like this?
> This is on amd64, NetBSD-10.99.12 on a xen VM with 2 processors.
> I have a bunch of other machines, both VM's and bare metal, running
> the same code without trouble.

So, since this is happening in a specific VM, would it not be easy to
adjust the amount of memory allocated to it, to see whether the hangs
become more likely with less memory or less likely with more memory?
Also, is it possible to clone the VM and run tests in the clone with
more or less RAM allocated?

--
Greg A. Woods <[email protected]>
Kelowna, BC +1 250 762-7675 RoboHack <[email protected]>
Planix, Inc. <[email protected]> Avoncote Farms <[email protected]>
