Hi,

I have seen this crash trace on a 6.6 based system, but I think the
bug exists still in -current.  It happened when an ixl(4) interface
was removed from trunk(4).

uvm_fault(0xfffffd8739dc6558, 0x0, 0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff81012a86 cs 8 rflags 10202 cr2 0 cpl 7 rsp 
ffff80002e4bd170
gsbase 0xffffffff816d6ff0  kgsbase 0x0
panic: trap type 6, code=0, pc=ffffffff81012a86
Starting stack trace...
panic() at panic+0x113
kerntrap(ffff800000bf3000) at kerntrap+0xdc
alltraps_kern_meltdown(6,0,4,0,ffff800000bf3000,0) at 
alltraps_kern_meltdown+0x7b
ixl_intr(ffff800000bf3000) at ixl_intr+0x3e6
intr_handler(ffffffff816d6ff0,ffff800000b57200) at intr_handler+0x5b
Xintr_ioapic_edge30_untramp(4,ffffffff814d3a00,4,18041969,ffffffff816d6ff0,d) 
at Xintr_ioapic_edge30_untramp+0x19f
Xspllower(ffffffff8178fb58,ffffffff816d6ff0,ffffffff8139d743,ffff800000bffd00,ffffffff8178fb40,10)
 at Xspllower+0xc
softintr_dispatch(0) at softintr_dispatch+0xc5
Xsoftclock(ffff800000bf3048,0,ffffffff8139d743,ffff800000bf3048,ffff800001207800,ffffffff814e862f)
 at Xsoftclock+0x1f
ifnewlladdr(ffff800001624e10) at ifnewlladdr+0xf8
trunk_port_destroy(ffff80002e4bd6e0) at trunk_port_destroy+0x2fd
trunk_ioctl(fffffd87387594c0,ffff800001207800,8048698e) at trunk_ioctl+0x6a6
ifioctl(fffffd87623df448,ffff80002e46e2d8,48,ffff80002e4bd7d0) at ifioctl+0x2d6
sys_ioctl(360,ffff80002e46e2d8,36) at sys_ioctl+0x3cd
syscall(0) at syscall+0x3d1
Xsyscall(6,36,1,36,7f7ffffda720,7f7ffffdaa21) at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7ffffda780, count: 241
End of stack trace.
syncing disks...

ifnewlladdr() is interrupted by ixl transmit interrupt.  There it
crashes in ixl_txeof as txr is NULL.  The code in -current if_ixl.c
has changed, so it might not happen anymore.  But I think the bug
is in ifnewlladdr().

ifnewlladdr() sets splnet() and configures the interface up and
down.  The ixl_down() code has some interrupt barriers which cannot
work while interrupts are blocked by splnet().  So interrupts fire
at splx() when the driver does not expect them.

Combining interrupt barriers with spl protection looks like a bad
idea.

Is there anything that lowers spl in all cases during intr_barrier(),
ifq_barrier() or timeout_del_barrier()?

How should spls work together with barriers?

The integrity of ifnewlladdr() state should be guaranteed by netlock.
Changing if_flags needs splnet() as they are used by all drivers.
The in6_ functions need netlock.  And driver SIOCSIFFLAGS ioctl
must not have splnet().

Is reducing splnet() the correct aproach?

bluhm

Index: net/if.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.c,v
retrieving revision 1.641
diff -u -p -r1.641 if.c
--- net/if.c    25 May 2021 22:45:09 -0000      1.641
+++ net/if.c    10 Jun 2021 14:32:12 -0000
@@ -3109,6 +3109,8 @@ ifnewlladdr(struct ifnet *ifp)
        short up;
        int s;
 
+       NET_ASSERT_LOCKED();
+
        s = splnet();
        up = ifp->if_flags & IFF_UP;
 
@@ -3116,11 +3118,14 @@ ifnewlladdr(struct ifnet *ifp)
                /* go down for a moment... */
                ifp->if_flags &= ~IFF_UP;
                ifrq.ifr_flags = ifp->if_flags;
+               splx(s);
                (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifrq);
+               s = splnet();
        }
 
        ifp->if_flags |= IFF_UP;
        ifrq.ifr_flags = ifp->if_flags;
+       splx(s);
        (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifrq);
 
 #ifdef INET6
@@ -3139,11 +3144,12 @@ ifnewlladdr(struct ifnet *ifp)
 #endif
        if (!up) {
                /* go back down */
+               s = splnet();
                ifp->if_flags &= ~IFF_UP;
                ifrq.ifr_flags = ifp->if_flags;
+               splx(s);
                (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifrq);
        }
-       splx(s);
 }
 
 void

Reply via email to