Re: Kernel Panic
Hi,

there is a function called pf_get_sport in /usr/src/sys/netpfil/pf/pf_lb.c which contains a do-while loop; the guard is ! PF_AEQ(_addr, naddr, af)). We put a counter in this loop and saw it spin 431728 times, which appears to coincide with a lockup.

We'll continue investigating tomorrow.

Regards
Joe Jones

From: Kristof Provost
Sent: 01 March 2018 09:57:18
To: Joe Jones
Cc: freebsd-pf@freebsd.org
Subject: Re: Kernel Panic

On 1 Mar 2018, at 15:37, Joe Jones wrote:
> yes we use pfsync. Yesterday we tried with pfsync switched off, the
> box still locked up but this time without a panic.
>
> We make the DIOCRADDADDRS ioctl on the master and the backup (we use
> CARPed pairs).

Interesting. It might be related to pfsync. Is it the master that panics or the backup? Or both?

Regards,
Kristof
___
freebsd-pf@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-pf
To unsubscribe, send any mail to "freebsd-pf-unsubscr...@freebsd.org"
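For readers following along, here is a minimal user-space sketch of the kind of instrumentation described above: a do-while loop that exits only when a round-robin walk hands back its starting address, with a spin counter and an upper bound added. Everything in it (struct rr_pool, rr_next, count_spins, addresses modelled as plain integers) is hypothetical and is not pf code. The case where the starting address is missing from the pool is only an assumed illustration of how such a guard can fail to fire, not a diagnosis of this lockup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a round-robin address pool; not a pf structure. */
struct rr_pool {
    const unsigned *addrs; /* pool addresses, modelled as plain integers */
    size_t size;           /* number of entries in addrs */
    size_t cur;            /* round-robin cursor */
};

/* Hand out the address under the cursor, then advance the cursor. */
static unsigned rr_next(struct rr_pool *p)
{
    unsigned addr = p->addrs[p->cur];

    p->cur = (p->cur + 1) % p->size;
    return addr;
}

/*
 * Walk the pool until it hands back init_addr, counting iterations.
 * Unlike an unbounded loop, give up once `limit` iterations have run,
 * so a guard that never fires shows up as a count instead of a lockup.
 */
static unsigned long count_spins(struct rr_pool *p, unsigned init_addr,
    unsigned long limit)
{
    unsigned long spins = 0;
    unsigned naddr;

    assert(p->size > 0 && limit > 0);
    do {
        naddr = rr_next(p);
        spins++;
    } while (naddr != init_addr && spins < limit);
    return spins;
}
```

With a four-entry pool the walk returns to its starting address after four iterations; if the starting address is not in the pool at all, count_spins stops at the limit instead of spinning forever.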
Re: Kernel Panic
On Thu, Mar 1, 2018 at 9:43 AM, Joe Jones wrote:
> Hi Kristof,
>
> It's just the master that crashed, the backup can take over.
>
> We think the panic we got by compiling with witness and invariant may be a
> red herring.
>
> We are now looking at rules like
>
> nat on $isp_if from to any -> sticky-address
>
> if we replace the external_napts table with a single address rather than a
> block of addresses the box does not crash.
>
> We are following this line of investigation at the moment.

This is a known issue and should be documented somewhere, possibly in the man page. Its source is the redesign of locking for pf(4).

https://github.com/freebsd/freebsd/blob/releng/11.1/sys/netpfil/pf/pf_lb.c#L428

 * XXXGL: in the round-robin case we need to store
 * the round-robin machine state in the rule, thus
 * forwarding thread needs to modify rule.
 *
 * This is done w/o locking, because performance is assumed
 * more important than round-robin precision.
 *
 * In the simpliest case we just update the "rpool->cur"
 * pointer. However, if pool contains tables or dynamic
 * addresses, then "tblidx" is also used to store machine
 * state. Since "tblidx" is int, concurrent access to it can't
 * lead to inconsistence, only to lost of precision.
 *
 * Things get worse, if table contains not hosts, but
 * prefixes. In this case counter also stores machine state,
 * and for IPv6 address, counter can't be updated atomically.
 * Probably, using round-robin on a table containing IPv6
 * prefixes (or even IPv4) would cause a panic.

The fix is to add proper locking around this scenario. At minimum a RULES_WLOCK would be needed there, or maybe resort to atomics.

> Regards
> Joe Jones
>
> On 01/03/18 09:57, Kristof Provost wrote:
>
>> On 1 Mar 2018, at 15:37, Joe Jones wrote:
>>
>>> yes we use pfsync. Yesterday we tried with pfsync switched off, the box
>>> still locked up but this time without a panic.
>>>
>>> We make the DIOCRADDADDRS ioctl on the master and the backup (we use
>>> CARPed pairs).
>>
>> Interesting. It might be related to pfsync. Is it the master that panics
>> or the backup? Or both?
>>
>> Regards,
>> Kristof

--
Ermal
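As a rough illustration of the atomics suggestion in the message above, here is a user-space sketch that advances a round-robin table index with a compare-and-swap loop, so that concurrent callers cannot lose an update or publish an out-of-range index. It assumes C11 <stdatomic.h>; kernel code would use the atomic(9) primitives instead. The names struct rr_state and rr_advance are hypothetical, and only the field name tblidx is borrowed from the quoted comment. This is not a patch for pf_lb.c.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical round-robin state; "tblidx" only echoes the quoted comment. */
struct rr_state {
    atomic_uint tblidx; /* index of the next table entry to hand out */
};

/*
 * Return the index to use now and atomically move the shared index
 * forward by one, wrapping at nentries. The compare-and-swap retries
 * if another thread changed tblidx between our load and our store.
 */
static unsigned rr_advance(struct rr_state *s, unsigned nentries)
{
    unsigned old, next;

    assert(nentries > 0);
    old = atomic_load(&s->tblidx);
    do {
        next = (old + 1) % nentries;
        /* On failure, old is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(&s->tblidx, &old, next));

    /* Clamp in case the stored index predates a smaller table. */
    return old % nentries;
}
```

A single-word compare-and-swap like this only covers the index. The case the quoted comment calls worse, where a multi-word IPv6 counter is also part of the state, would still need a lock around the whole update.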
Re: Kernel Panic
Hi Kristof,

It's just the master that crashed, the backup can take over.

We think the panic we got by compiling with witness and invariant may be a red herring.

We are now looking at rules like

nat on $isp_if from to any -> sticky-address

If we replace the external_napts table with a single address rather than a block of addresses, the box does not crash.

We are following this line of investigation at the moment.

Regards
Joe Jones

On 01/03/18 09:57, Kristof Provost wrote:
> On 1 Mar 2018, at 15:37, Joe Jones wrote:
>> yes we use pfsync. Yesterday we tried with pfsync switched off, the box
>> still locked up but this time without a panic.
>>
>> We make the DIOCRADDADDRS ioctl on the master and the backup (we use
>> CARPed pairs).
>
> Interesting. It might be related to pfsync. Is it the master that panics
> or the backup? Or both?
>
> Regards,
> Kristof
Re: Kernel Panic
Hi Kristof,

yes we use pfsync. Yesterday we tried with pfsync switched off, the box still locked up but this time without a panic.

We make the DIOCRADDADDRS ioctl on the master and the backup (we use CARPed pairs).

Regards
Joe Jones

On 01/03/18 03:00, Kristof Provost wrote:
> On 28 Feb 2018, at 9:52, Kristof Provost wrote:
>> On 27 Feb 2018, at 20:40, Joe Jones wrote:
>>> we have a kernel panic after compiling with witness and invariant
>>>
>>> Feb 27 13:49:33 sovapn1 kernel: lock order reversal:
>>> Feb 27 13:49:33 sovapn1 kernel: 1st 0xfe000fed78b8 pf_idhash (pf_idhash) @ /usr/src/sys/netpfil/pf/pf.c:1078
>>> Feb 27 13:49:33 sovapn1 kernel: 2nd 0xf8001e0474a8 pfsync (pfsync) @ /usr/src/sys/netpfil/pf/if_pfsync.c:1667
>>
>> That’s a lock order reversal. It’s not good, but it should at worst
>> result in a deadlock. Did the system stop after this?
>> It also looks like a different problem from the panic you initially
>> reported.
>
> Also, do you actively use pfsync in this setup? Does the panic happen on
> the box where you DIOCRADDADDRS or the other(s)?
>
> Regards,
> Kristof
Re: Kernel Panic
On 1 Mar 2018, at 15:37, Joe Jones wrote:
> yes we use pfsync. Yesterday we tried with pfsync switched off, the box
> still locked up but this time without a panic.
>
> We make the DIOCRADDADDRS ioctl on the master and the backup (we use
> CARPed pairs).

Interesting. It might be related to pfsync. Is it the master that panics or the backup? Or both?

Regards,
Kristof