On Tue, May 30, 2023 at 09:08:45PM +0200, Alexander Bluhm wrote:
> Hi,
>
> Currently packets sent over loopback interface get their checksum
> calculated twice. In the output path it is set and during TCP/IP
> input it is calculated again to be compared with the previous value.
>
> This can be avoided by claiming that lo(4) supports hardware checksum
> offloading. For each packet convert the flag that the checksum
> should be calculated to the flag that it has been checked successfully.
so rather than calculate it twice, don't calculate it at all?
i love it.
> In a simple test on a vmm guest I see between 30% to 60% increase
> of thoughput over lo0.
>
> A drawback is that "tcpdump -ni lo0 -v" reports invalid checksum.
> But people are used to that with physical interfaces and hardware
> offloading.
we could export these csum flags as part of the bpf header so we can
teach tcpdump to shut up in this situation.
> The question is, does it break corner cases? Please test with pf
> route-to, IPsec, bridging, IPv6 and other setups where loopback
> might be involved.
who knows how bridge works? otherwise i think it should be ok.
hopefully.
>
> bluhm
>
> Index: net/if.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.c,v
> retrieving revision 1.697
> diff -u -p -r1.697 if.c
> --- net/if.c 16 May 2023 14:32:54 -0000 1.697
> +++ net/if.c 30 May 2023 12:33:42 -0000
> @@ -778,7 +778,7 @@ if_input(struct ifnet *ifp, struct mbuf_
> int
> if_input_local(struct ifnet *ifp, struct mbuf *m, sa_family_t af)
> {
> - int keepflags;
> + int keepflags, keepcksum;
>
> #if NBPFILTER > 0
> /*
> @@ -796,11 +796,26 @@ if_input_local(struct ifnet *ifp, struct
> }
> #endif
> keepflags = m->m_flags & (M_BCAST|M_MCAST);
> + /*
> + * Preserve outgoing checksum flags, in case the packet is
> + * forwarded to another interface. Then the checksum, which
> + * is now incorrect, will be calculated before sending.
> + */
> + keepcksum = m->m_pkthdr.csum_flags & (M_IPV4_CSUM_OUT |
> + M_TCP_CSUM_OUT | M_UDP_CSUM_OUT | M_ICMP_CSUM_OUT);
> m_resethdr(m);
> m->m_flags |= M_LOOP | keepflags;
> + m->m_pkthdr.csum_flags = keepcksum;
> m->m_pkthdr.ph_ifidx = ifp->if_index;
> m->m_pkthdr.ph_rtableid = ifp->if_rdomain;
>
> + if (ISSET(keepcksum, M_TCP_CSUM_OUT))
> + m->m_pkthdr.csum_flags |= M_TCP_CSUM_IN_OK;
> + if (ISSET(keepcksum, M_UDP_CSUM_OUT))
> + m->m_pkthdr.csum_flags |= M_UDP_CSUM_IN_OK;
> + if (ISSET(keepcksum, M_ICMP_CSUM_OUT))
> + m->m_pkthdr.csum_flags |= M_ICMP_CSUM_IN_OK;
> +
> ifp->if_opackets++;
> ifp->if_obytes += m->m_pkthdr.len;
>
> @@ -809,6 +824,8 @@ if_input_local(struct ifnet *ifp, struct
>
> switch (af) {
> case AF_INET:
> + if (ISSET(keepcksum, M_IPV4_CSUM_OUT))
> + m->m_pkthdr.csum_flags |= M_IPV4_CSUM_IN_OK;
> ipv4_input(ifp, m);
> break;
> #ifdef INET6
> Index: net/if_loop.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_loop.c,v
> retrieving revision 1.93
> diff -u -p -r1.93 if_loop.c
> --- net/if_loop.c 21 Oct 2022 14:20:03 -0000 1.93
> +++ net/if_loop.c 30 May 2023 11:48:55 -0000
> @@ -173,6 +173,9 @@ loop_clone_create(struct if_clone *ifc,
> ifp->if_mtu = LOMTU;
> ifp->if_flags = IFF_LOOPBACK | IFF_MULTICAST;
> ifp->if_xflags = IFXF_CLONED;
> + ifp->if_capabilities = IFCAP_CSUM_IPv4 |
> + IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4 |
> + IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
> ifp->if_rtrequest = lortrequest;
> ifp->if_ioctl = loioctl;
> ifp->if_input = loinput;
>