On 31 Mar 2020, at 17:28, Kristof Provost wrote:
On 31 Mar 2020, at 17:17, Mark Johnston wrote:
On Tue, Mar 31, 2020 at 03:51:27PM +0800, Li-Wen Hsu wrote:
On Tue, Mar 31, 2020 at 3:00 PM Kristof Provost <k...@freebsd.org> wrote:

On 31 Mar 2020, at 7:56, Li-Wen Hsu wrote:
On Tue, Mar 31, 2020 at 10:55 AM Mark Johnston <ma...@freebsd.org> wrote:
It seems they could be triggered by the sys.netinet6.frag6.*,
sys.netpfil.common.*, and sbin.pfctl.pfctl_test.* tests, and lots
of test cases timed out.

Can you help check these?

I see, it is actually caused by r359438.  I'm looking at it now.

I verified that the netpfil and netinet6 tests pass with r359477.

Thanks for the fix. The latest test run panics at epair_qflush:

https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14747/consoleFull

while executing the sys.netpfil.pf.* tests. I'm not sure whether this is
related or caused by previous commits (I suspect the latter). I'll
look into this.

That’s a known issue with epair (since EPOCH, I believe).
A number of the pf tests are disabled due to this. See 238870.

I think so too. Btw, currently every test run panics, so I am afraid
the recent commits might have made things worse (or rather, made the
issue easier to reproduce?)

I haven't been able to reproduce any panics or test failures so far.

Once you disable the ‘atf_skip’ lines in the pf tests, a simple `sudo kldload pfsync && cd /usr/tests/sys/netpfil/pf && sudo kyua test` is likely sufficient.

The names:names test is a great candidate for this. Remove the `atf_skip …` line in /usr/tests/sys/netpfil/pf/names and run that a few times. It’s not 100% reliable, but the test is very fast and will likely panic every other run or more.

Example backtrace:

        panic: epair_qflush: ifp=0xfffff800079c9000, epair_softc gone? sc=0

        cpuid = 1
        time = 1585666518
        KDB: stack backtrace:
        db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001bd7e790
        vpanic() at vpanic+0x182/frame 0xfffffe001bd7e7e0
        panic() at panic+0x43/frame 0xfffffe001bd7e840
        epair_qflush() at epair_qflush+0x1a8/frame 0xfffffe001bd7e890
        if_down() at if_down+0x12d/frame 0xfffffe001bd7e8c0
        if_detach_internal() at if_detach_internal+0x2ee/frame 0xfffffe001bd7e920
        if_vmove() at if_vmove+0x3c/frame 0xfffffe001bd7e970
        vnet_if_return() at vnet_if_return+0x50/frame 0xfffffe001bd7e990
        vnet_destroy() at vnet_destroy+0x130/frame 0xfffffe001bd7e9c0
        prison_deref() at prison_deref+0x29d/frame 0xfffffe001bd7ea00
        taskqueue_run_locked() at taskqueue_run_locked+0xaa/frame 0xfffffe001bd7ea80
        taskqueue_thread_loop() at taskqueue_thread_loop+0x94/frame 0xfffffe001bd7eab0
        fork_exit() at fork_exit+0x80/frame 0xfffffe001bd7eaf0
        fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001bd7eaf0
        --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
        KDB: enter: panic
        [ thread pid 0 tid 100014 ]
        Stopped at      kdb_enter+0x37: movq    $0,0x10927a6(%rip)
        db>

You might see different panics too. The epair teardown flow is complex, and broken.
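
For context, the panic string suggests epair_qflush() fetches its softc from ifp->if_softc and asserts that it is still attached when the queues get flushed during vnet/jail teardown. A rough sketch of that check, based only on the message above (the actual code in sys/net/if_epair.c may differ, and struct epair_softc is private to that file):

        /*
         * Sketch only: the kind of assertion that produces the panic above,
         * assuming the softc is looked up via ifp->if_softc.
         * KASSERT() only fires on kernels built with INVARIANTS.
         */
        #include <sys/param.h>
        #include <sys/systm.h>          /* KASSERT() */
        #include <net/if.h>
        #include <net/if_var.h>         /* struct ifnet */

        struct epair_softc;             /* private to if_epair.c */

        static void
        epair_qflush(struct ifnet *ifp)
        {
                struct epair_softc *sc;

                sc = ifp->if_softc;
                /* Fires during vnet teardown if the softc is already gone. */
                KASSERT(sc != NULL, ("%s: ifp=%p, epair_softc gone? sc=%p",
                    __func__, ifp, sc));

                /* ... flush the epair send queues here ... */
        }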

Best regards,
Kristof