On 2012-12-18, Marcin <mig...@gmail.com> wrote:
> Today a member of my 2 machines firewall cluster running 5.2 panicked
> with following info (screenshot at http://tinypic.com/r/11t7nrl/6):
>
> panic: pmap_remove_ptes: unmanaged page marked PG_PVLIST, va =
> 0x3c005000, pa = 0xfffff000
>
> The machine, along with its identical twin, runs a standard suite of:
> PF (including carp and pfsync), relayd and bgpd.
> It is the 5th panic since the cluster was commissioned over a week ago,
> all of them happened in the same function pmap_remove_ptes.
>
> I found an older thread with Stuart reporting similar issue here
> http://marc.info/?l=openbsd-tech&m=132593610913252

Frequent is kind-of good ;) I had a few crashes close together but then
nothing (and I've moved most of those boxes to amd64 by now).

It was suggested that I run with kern.pool_debug=1 (which will be
disabled by default on release kernels) and try the "slow recycle" diff.
I do not have a copy of that diff any more, but somebody reading might.

Really you'll want some way to log DDB output (serial console preferably,
unless you are lucky and the dmesg buffer survives a reboot), and at
least run "show all pools" as well as the usual trace / ps. If you have
a crash dump (look in /var/crash), that may be of use to someone too.
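
For reference, the steps above look roughly like this. It is only a sketch, assuming a stock install with root access and a console you can capture; the ddb> prompt lines are what you type after the panic, not shell commands.

```
# as root, on the running system
sysctl kern.pool_debug=1

# to keep the setting across reboots, add this line to /etc/sysctl.conf
kern.pool_debug=1

# at the ddb prompt after the panic, save the output of each of these
ddb> trace
ddb> ps
ddb> show all pools

# after the reboot, check whether a crash dump was saved
ls -l /var/crash
```

With a serial console you can simply log the whole session on the other end, which saves retyping the output from a screenshot.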