On Fri, Apr 10, 2020 at 09:51:40AM +0200, Martin Pieuchot wrote:
> On 09/04/20(Thu) 16:10, Massimiliano Stucchi wrote:
> > >Synopsis:	Crash while using ospfd over vxlan
> > >Category:	bug
> > >Environment:
> > 	System      : OpenBSD 6.6
> > 	Details     : OpenBSD 6.6 (GENERIC.MP) #5: Sun Feb 16 01:56:11 MST 2020
> >
> > r...@syspatch-66-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> >
> > 	Architecture: OpenBSD.amd64
> > 	Machine     : amd64
> > >Description:
> > 	Setting up an OSPF session over VXLAN leads to a kernel crash
> > >How-To-Repeat:
> >
> > I have setup an ospf session over a vxlan interface.  When this is up,
> > it takes about 2-3 minutes for the crash to consistently happen.
> >
> > No other action is necessary.
> >
> > At this address:
> >
> > https://max.stucchi.ch/bugreport/
> >
> > you can find screenshots from the ddb prompt, including a full trace.
> >
> > If needed, I can also provide access to the console.
>
> It's a recursion.  I don't know anything about vxlan(4) or how the
> encapsulation works but the following happens at least 10 times:
>
> 	...
> 	vxlan_lookup()
> 	udp_input()
> 	ip_deliver()
> 	ip_ours()
> 	ip_input_if()
> 	ipv4_input()
> 	ether_input()
> 	if_vinput()
> 	vxlan_lookup()
> 	...
>
> Maybe you can share your setup (vxlan config, ospf config, etc) so
> somebody can try to reproduce and fix it.

Possible recursion through encap drivers is a problem they all share.
The vxlan(4) manpage should probably have the following text copied
into it from one of the other manpages:

     For correct operation, encapsulated traffic must not be routed over
     the interface itself.  This can be implemented by adding a distinct
     or a more specific route to the tunnel destination than the hosts or
     networks routed via the tunnel interface.  Alternatively, the tunnel
     traffic may be configured in a separate routing table to the
     encapsulated traffic.

Misconfiguration shouldn't result in a panic or fault though, so can
you try the following diff? It copies into vxlan the mechanism that
other encap drivers use to prevent recursion. There are some more
drivers that don't do this, which I'll try and fix up in the next few
days.

Cheers,
dlg

Index: if_vxlan.c
===================================================================
RCS file: /cvs/src/sys/net/if_vxlan.c,v
retrieving revision 1.76
diff -u -p -r1.76 if_vxlan.c
--- if_vxlan.c	8 Nov 2019 07:16:29 -0000	1.76
+++ if_vxlan.c	12 Apr 2020 01:36:26 -0000
@@ -82,6 +82,8 @@ struct vxlan_softc {
 
 void	 vxlanattach(int);
 int	 vxlanioctl(struct ifnet *, u_long, caddr_t);
+int	 vxlanoutput(struct ifnet *, struct mbuf *, struct sockaddr *,
+	    struct rtentry *);
 void	 vxlanstart(struct ifnet *);
 int	 vxlan_clone_create(struct if_clone *, int);
 int	 vxlan_clone_destroy(struct ifnet *);
@@ -150,6 +152,7 @@ vxlan_clone_create(struct if_clone *ifc,
 
 	ifp->if_softc = sc;
 	ifp->if_ioctl = vxlanioctl;
+	ifp->if_output = vxlanoutput;
 	ifp->if_start = vxlanstart;
 	IFQ_SET_MAXLEN(&ifp->if_snd, IFQ_MAXLEN);
 
@@ -294,6 +297,33 @@ vxlan_multicast_join(struct ifnet *ifp,
 	if_detachhook_add(mifp, &sc->sc_dtask);
 
 	return (0);
+}
+
+int
+vxlanoutput(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst,
+    struct rtentry *rt)
+{
+	struct m_tag *mtag;
+
+	/* Try to limit infinite recursion through misconfiguration. */
+	for (mtag = m_tag_find(m, PACKET_TAG_GRE, NULL); mtag;
+	    mtag = m_tag_find(m, PACKET_TAG_GRE, mtag)) {
+		if (memcmp((caddr_t)(mtag + 1), &ifp->if_index,
+		    sizeof(ifp->if_index)) == 0) {
+			m_freem(m);
+			return (EIO);
+		}
+	}
+
+	mtag = m_tag_get(PACKET_TAG_GRE, sizeof(ifp->if_index), M_NOWAIT);
+	if (mtag == NULL) {
+		m_freem(m);
+		return (ENOMEM);
+	}
+	memcpy((caddr_t)(mtag + 1), &ifp->if_index, sizeof(ifp->if_index));
+	m_tag_prepend(m, mtag);
+
+	return (ether_output(ifp, m, dst, rt));
 }
 
 void
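
The guard in the diff can be modelled in userland roughly as follows.
This is only a sketch of the idea, not kernel code: struct pkt,
struct tag, pkt_seen_by(), guarded_output() and pkt_free_tags() are
made-up stand-ins for the mbuf and m_tag interfaces, and unlike the
diff the sketch does not free the packet when it rejects it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an m_tag carrying an interface index. */
struct tag {
	struct tag	*next;
	unsigned int	 ifidx;	/* interface that tagged the packet */
};

/* Hypothetical stand-in for an mbuf; only the tag list is modelled. */
struct pkt {
	struct tag	*tags;
};

/* Return 1 if the packet already carries a tag for this interface. */
static int
pkt_seen_by(const struct pkt *p, unsigned int ifidx)
{
	const struct tag *t;

	for (t = p->tags; t != NULL; t = t->next) {
		if (t->ifidx == ifidx)
			return (1);
	}
	return (0);
}

/*
 * Model of the output-path check: reject a packet this interface has
 * already sent once, otherwise tag it and let it through.
 * Returns 0 on success, EIO on a detected loop, ENOMEM if the tag
 * cannot be allocated.  The caller keeps ownership of the packet.
 */
static int
guarded_output(struct pkt *p, unsigned int ifidx)
{
	struct tag *t;

	assert(p != NULL);

	if (pkt_seen_by(p, ifidx))
		return (EIO);

	t = malloc(sizeof(*t));
	if (t == NULL)
		return (ENOMEM);
	t->ifidx = ifidx;
	t->next = p->tags;	/* prepend, like m_tag_prepend() */
	p->tags = t;

	return (0);
}

/* Release every tag attached to the packet. */
static void
pkt_free_tags(struct pkt *p)
{
	struct tag *t;

	while ((t = p->tags) != NULL) {
		p->tags = t->next;
		free(t);
	}
}
```

In this model a packet passes through a given interface index once; a
second pass through the same index returns EIO, while a pass through a
different index is still allowed, so nested tunnels over distinct
interfaces keep working.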