Re: wg(4): encapsulated transport checksums
On Tue, 30 Jun 2020 20:40:10 -0600 "Theo de Raadt" wrote: > Matt Dunwoodie wrote: > > > Depends on your definition of significant, I've seen 1-3% throughput > > improvement without the patch. > > > Real networks require statistics, which you want to throw away. > > > Overall, it is still debatable whether to skip the IPv4 checksum as > > modern crypto certainly offers better integrity checks. However, the > > primary motivator for skipping the integrity checks is performance, > > and the performance isn't severely impacted. Additionally, I can > > sympathise with avoiding layer violations and bringing it inline > > with other tunnels in this case. > > If it is debatable, why don't you debate it? I don't see a debate. > > Let me debate it. > > The issue is not about integrity checks being needed, but about > integrity check counters -- such counters are used to short-cut > procedures during network diagostic failures in multi-configuration > systems. > > If a higher-level network overlay skips that counter updates for > lower-levels, the counters are incorrect, now how do you diagnose > quickly? Well, you don't. Before going any further, I should clarify that the outer packet checksums are already checked (with or without this patch) in ip_input_if and udp_input when being received on the lower-level interface. Therefore, any lower-level link layer corruption to the outer packet will be caught, dropped and counters incremented before being passed to wg_input. Does that change any of your following points? > It sounds like the overlay is being chosen and relevant as more > important than the underlay. Sorry to burst your bubble, but the > overlay will never be the whole internet. The underlay will persist > for a long time, and the underlay will see errors. But the counters > indicating those erors will be *incoherent*. > > To me, it seems your path leads to the inablity to diagnose underlying > issues correctly and quickly > > Are underlying issues suddenly absent, or rare enough, they don't need > quick diagnosis? > > Or do (all) overlay technologies now provide enough information > access to make evaluation of underlying failures easy? > > For those questions, in my experience, I don't think reality provides > easy paths yet. > > As I said, argue it from a non-wg diagnosis model. If the argument is > not convincing, we have to do the obvious right thing, even if it > costs a small amount. > > Honestly, i don't understand how you ended in the position you are. Now, the "debatability" is about whether we want to check the inner packet IPv4 checksum after successful decryption (not about counters). The story may be different if you have any cases to add to the three Jason sent through earlier. I said "debatable" because I still think both sides are vaild, however when I weigh up the arguments I see applying the patch as the right thing to do. - Matt
Re: wg(4): encapsulated transport checksums
Matt Dunwoodie wrote: > Depends on your definition of significant, I've seen 1-3% throughput > improvement without the patch. > Real networks require statistics, which you want to throw away. > Overall, it is still debatable whether to skip the IPv4 checksum as > modern crypto certainly offers better integrity checks. However, the > primary motivator for skipping the integrity checks is performance, and > the performance isn't severely impacted. Additionally, I can sympathise > with avoiding layer violations and bringing it inline with other > tunnels in this case. If it is debatable, why don't you debate it? I don't see a debate. Let me debate it. The issue is not about integrity checks being needed, but about integrity check counters -- such counters are used to short-cut procedures during network diagostic failures in multi-configuration systems. If a higher-level network overlay skips that counter updates for lower-levels, the counters are incorrect, now how do you diagnose quickly? Well, you don't. It sounds like the overlay is being chosen and relevant as more important than the underlay. Sorry to burst your bubble, but the overlay will never be the whole internet. The underlay will persist for a long time, and the underlay will see errors. But the counters indicating those erors will be *incoherent*. To me, it seems your path leads to the inablity to diagnose underlying issues correctly and quickly Are underlying issues suddenly absent, or rare enough, they don't need quick diagnosis? Or do (all) overlay technologies now provide enough information access to make evaluation of underlying failures easy? For those questions, in my experience, I don't think reality provides easy paths yet. As I said, argue it from a non-wg diagnosis model. If the argument is not convincing, we have to do the obvious right thing, even if it costs a small amount. Honestly, i don't understand how you ended in the position you are.
Re: wg(4): encapsulated transport checksums
On Tue, 30 Jun 2020 09:22:18 +1200 (NZST) richard.n.proc...@gmail.com wrote: > Hi Jason, > > On 27/06/2020, at 10:09 PM, Jason A. Donenfeld > wrote: > > [...] I thought I'd lay out my understanding of the matter, > > and you can let me know whether or not this corresponds with what > > you had in mind: > > [...] > > My main concern is the end-to-end TCP or UDP transport checksum of the > tunneled (or inner, or encapsulated) packet. Your argument seems to > concern instead the per-hop IPv4 header checksum (which is also > interesting to look at, but first things first). > > As I read it, wg_decap() tells the stack to ignore the transport > checksum of the tunneled packet, and I think this is incorrect for > the following reason: the packet may have been corrupted outside of > the tunnel, because the tunnel needn't cover the entire path the > packet took through the network. > > So a tunneled packet may be corrupt, and one addressed to the tunnel > endpoint will be delivered with its end-to-end transport checksum > unverified. Higher layers may receive corrupt data as as result.[*] > > (Routers, as intermediate network elements, are under no obligation to > check end-to-end transport checksums. It isn't their job.) > > One might argue (I do not) that this isn't a concern, because > encryption these days is ubiquitous and comes with far stronger > integrity checks. Nonetheless, even here transport checksums remain > relevant for two reasons. Firsly, because networks remain unreliable: > I see non-zero numbers of failed checksums on my systems. And, > secondly, because higher layers require a reliable transport: TLS for > instance will drop the connection when the MAC check fails[1]. > > Let's turn to the per-hop IPv4 checksum, which wg_decap() also tells > the stack to ignore. I suspect the cost of verifying that checksum -- > 20 bytes on a hot cache line -- is negligible both absolutely and > relative to the cost of decrypting the packet. And as the > optimisation appears nowhere else in the tree that I can find, > removing it would make wg(4) no worse than the other tunnels. Do you > have evidence that the cost is in fact significant? Depends on your definition of significant, I've seen 1-3% throughput improvement without the patch. Without patch: --- 10.0.0.2 tcpbench statistics --- 4455088992 bytes sent over 121.001 seconds bandwidth min/avg/max/std-dev = 122.940/294.737/324.080/30.782 Mbps With patch: --- 10.0.0.2 tcpbench statistics --- 4366145736 bytes sent over 120.999 seconds bandwidth min/avg/max/std-dev = 123.908/288.718/329.770/31.518 Mbps > Further, as Theo pointed out, the link layer has in any case no > business with the IPv4 checksum, being part of the network layer. > Layer violations are problematic for at least two reasons. Firstly, > they constrict the design space. For instance, I suppose the IPv4 > checksum needn't be per-hop. It might be updated incrementally, > increasing its scope. But this would be nullified by any link layer > that, acting on a false assumption about the layer above it, told the > stack to ignore that layer's checksum. Secondly, layer violations, an > optimisation, which almost by definition rely on additional premises, > burden the correctness obligation and so trade less work for the > computer in narrow circumstances for (often much) more work for the > fallible human to show that the process remains sound. > > On that balance I now think that skipping the IPv4 check a false > economy, too. Hopefully I have managed to make my reasons clearer > this time around, please let me know if not, or if you disagree. > > See updated patch below, looking for oks to commit that. Overall, it is still debatable whether to skip the IPv4 checksum as modern crypto certainly offers better integrity checks. However, the primary motivator for skipping the integrity checks is performance, and the performance isn't severely impacted. Additionally, I can sympathise with avoiding layer violations and bringing it inline with other tunnels in this case. I'm happy for the following patch to be committed. - Matt > cheers, > Richard. > > [*] By way of background, I understand the transport checksum was > added to the early ARPANET primarily to guard against corruption > inside its routers[0], after a router fault which "[brought] the > network to its knees". In other words, end-to-end checks are > necessary to catch faults in the routers; link-level checks aren't > enough. > > More generally, transport checksums set a lower bound on the > reliability of the end-to-end transport, which decouples it from the > reliability of the underlying network. > > [0] Kleinrock "Queuing Systems, Volume 2: Computer Applications", > John Wiley and Sons (1976), p441 > > [1] RFC5246 TLS v1.2 (2008), section 7.2.2. > > TLS's requirement is reasonable: it would otherwise be obliged to > reimplement for itself retransmission, which would drag with it other > transport layer mechanism
Re: wg(4): encapsulated transport checksums
On Tue, Jun 30, 2020 at 09:22:18AM +1200, richard.n.proc...@gmail.com wrote: > Hi Jason, > > On 27/06/2020, at 10:09 PM, Jason A. Donenfeld wrote: > > [...] I thought I'd lay out my understanding of the matter, > > and you can let me know whether or not this corresponds with what you > > had in mind: > > [...] > > My main concern is the end-to-end TCP or UDP transport checksum of the > tunneled (or inner, or encapsulated) packet. Your argument seems to > concern instead the per-hop IPv4 header checksum (which is also > interesting to look at, but first things first). > > As I read it, wg_decap() tells the stack to ignore the transport checksum > of the tunneled packet, and I think this is incorrect for the following > reason: the packet may have been corrupted outside of the tunnel, because > the tunnel needn't cover the entire path the packet took through the > network. > > So a tunneled packet may be corrupt, and one addressed to the tunnel > endpoint will be delivered with its end-to-end transport checksum > unverified. Higher layers may receive corrupt data as as result.[*] > > (Routers, as intermediate network elements, are under no obligation to > check end-to-end transport checksums. It isn't their job.) > > One might argue (I do not) that this isn't a concern, because encryption > these days is ubiquitous and comes with far stronger integrity checks. > Nonetheless, even here transport checksums remain relevant for two > reasons. Firsly, because networks remain unreliable: I see non-zero > numbers of failed checksums on my systems. And, secondly, because higher > layers require a reliable transport: TLS for instance will drop the > connection when the MAC check fails[1]. > > Let's turn to the per-hop IPv4 checksum, which wg_decap() also tells the > stack to ignore. I suspect the cost of verifying that checksum -- 20 bytes > on a hot cache line -- is negligible both absolutely and relative to the > cost of decrypting the packet. And as the optimisation appears nowhere > else in the tree that I can find, removing it would make wg(4) no worse > than the other tunnels. Do you have evidence that the cost is in fact > significant? > > Further, as Theo pointed out, the link layer has in any case no business > with the IPv4 checksum, being part of the network layer. Layer violations > are problematic for at least two reasons. Firstly, they constrict the > design space. For instance, I suppose the IPv4 checksum needn't be > per-hop. It might be updated incrementally, increasing its scope. But this > would be nullified by any link layer that, acting on a false assumption > about the layer above it, told the stack to ignore that layer's checksum. > Secondly, layer violations, an optimisation, which almost by definition > rely on additional premises, burden the correctness obligation and so > trade less work for the computer in narrow circumstances for (often much) > more work for the fallible human to show that the process remains sound. > > On that balance I now think that skipping the IPv4 check a false economy, > too. Hopefully I have managed to make my reasons clearer this time around, > please let me know if not, or if you disagree. > > See updated patch below, looking for oks to commit that. > > cheers, > Richard. > > [*] By way of background, I understand the transport checksum was added to > the early ARPANET primarily to guard against corruption inside its > routers[0], after a router fault which "[brought] the network to its > knees". In other words, end-to-end checks are necessary to catch faults in > the routers; link-level checks aren't enough. > > More generally, transport checksums set a lower bound on the reliability > of the end-to-end transport, which decouples it from the reliability of > the underlying network. > > [0] Kleinrock "Queuing Systems, Volume 2: Computer Applications", > John Wiley and Sons (1976), p441 > > [1] RFC5246 TLS v1.2 (2008), section 7.2.2. > > TLS's requirement is reasonable: it would otherwise be obliged to > reimplement for itself retransmission, which would drag with it other > transport layer mechanisms like acknowledgements and (for performance) > sequence numbers and sliding windows. I agree with all of this and think that this is the right fix. OK claudio@ > Index: net/if_wg.c > === > RCS file: /cvs/src/sys/net/if_wg.c,v > retrieving revision 1.7 > diff -u -p -u -p -r1.7 if_wg.c > --- net/if_wg.c 23 Jun 2020 10:03:49 - 1.7 > +++ net/if_wg.c 28 Jun 2020 20:17:07 - > @@ -1660,14 +1660,8 @@ wg_decap(struct wg_softc *sc, struct mbu > goto error; > } > > - /* > - * We can mark incoming packet csum OK. We mark all flags OK > - * irrespective to the packet type. > - */ > - m->m_pkthdr.csum_flags |= (M_IPV4_CSUM_IN_OK | M_TCP_CSUM_IN_OK | > - M_UDP_CSUM_IN_OK | M_ICMP_CSUM_I
Re: wg(4): encapsulated transport checksums
Hi Jason, On 27/06/2020, at 10:09 PM, Jason A. Donenfeld wrote: > [...] I thought I'd lay out my understanding of the matter, > and you can let me know whether or not this corresponds with what you > had in mind: > [...] My main concern is the end-to-end TCP or UDP transport checksum of the tunneled (or inner, or encapsulated) packet. Your argument seems to concern instead the per-hop IPv4 header checksum (which is also interesting to look at, but first things first). As I read it, wg_decap() tells the stack to ignore the transport checksum of the tunneled packet, and I think this is incorrect for the following reason: the packet may have been corrupted outside of the tunnel, because the tunnel needn't cover the entire path the packet took through the network. So a tunneled packet may be corrupt, and one addressed to the tunnel endpoint will be delivered with its end-to-end transport checksum unverified. Higher layers may receive corrupt data as as result.[*] (Routers, as intermediate network elements, are under no obligation to check end-to-end transport checksums. It isn't their job.) One might argue (I do not) that this isn't a concern, because encryption these days is ubiquitous and comes with far stronger integrity checks. Nonetheless, even here transport checksums remain relevant for two reasons. Firsly, because networks remain unreliable: I see non-zero numbers of failed checksums on my systems. And, secondly, because higher layers require a reliable transport: TLS for instance will drop the connection when the MAC check fails[1]. Let's turn to the per-hop IPv4 checksum, which wg_decap() also tells the stack to ignore. I suspect the cost of verifying that checksum -- 20 bytes on a hot cache line -- is negligible both absolutely and relative to the cost of decrypting the packet. And as the optimisation appears nowhere else in the tree that I can find, removing it would make wg(4) no worse than the other tunnels. Do you have evidence that the cost is in fact significant? Further, as Theo pointed out, the link layer has in any case no business with the IPv4 checksum, being part of the network layer. Layer violations are problematic for at least two reasons. Firstly, they constrict the design space. For instance, I suppose the IPv4 checksum needn't be per-hop. It might be updated incrementally, increasing its scope. But this would be nullified by any link layer that, acting on a false assumption about the layer above it, told the stack to ignore that layer's checksum. Secondly, layer violations, an optimisation, which almost by definition rely on additional premises, burden the correctness obligation and so trade less work for the computer in narrow circumstances for (often much) more work for the fallible human to show that the process remains sound. On that balance I now think that skipping the IPv4 check a false economy, too. Hopefully I have managed to make my reasons clearer this time around, please let me know if not, or if you disagree. See updated patch below, looking for oks to commit that. cheers, Richard. [*] By way of background, I understand the transport checksum was added to the early ARPANET primarily to guard against corruption inside its routers[0], after a router fault which "[brought] the network to its knees". In other words, end-to-end checks are necessary to catch faults in the routers; link-level checks aren't enough. More generally, transport checksums set a lower bound on the reliability of the end-to-end transport, which decouples it from the reliability of the underlying network. [0] Kleinrock "Queuing Systems, Volume 2: Computer Applications", John Wiley and Sons (1976), p441 [1] RFC5246 TLS v1.2 (2008), section 7.2.2. TLS's requirement is reasonable: it would otherwise be obliged to reimplement for itself retransmission, which would drag with it other transport layer mechanisms like acknowledgements and (for performance) sequence numbers and sliding windows. Index: net/if_wg.c === RCS file: /cvs/src/sys/net/if_wg.c,v retrieving revision 1.7 diff -u -p -u -p -r1.7 if_wg.c --- net/if_wg.c 23 Jun 2020 10:03:49 - 1.7 +++ net/if_wg.c 28 Jun 2020 20:17:07 - @@ -1660,14 +1660,8 @@ wg_decap(struct wg_softc *sc, struct mbu goto error; } - /* -* We can mark incoming packet csum OK. We mark all flags OK -* irrespective to the packet type. -*/ - m->m_pkthdr.csum_flags |= (M_IPV4_CSUM_IN_OK | M_TCP_CSUM_IN_OK | - M_UDP_CSUM_IN_OK | M_ICMP_CSUM_IN_OK); - m->m_pkthdr.csum_flags &= ~(M_IPV4_CSUM_IN_BAD | M_TCP_CSUM_IN_BAD | - M_UDP_CSUM_IN_BAD | M_ICMP_CSUM_IN_BAD); + /* tunneled packet was not offloaded */ + m->m_pkthdr.csum_flags = 0; m->m_pkthdr.ph_ifidx = sc->sc_if.if_index; m->m_pkthdr.ph_rtableid = sc->sc_if.if_rdomain;
Re: wg(4): encapsulated transport checksums
> - Therefore, it's not necessary to check the IP checksum on ingress because: There is actually a really good reason. There are various counters (of all packets) which people observe to debug network problems. Now, if lower-level packets carrying wg with corruption don't increment those counters, the statistics will be incorrect. I think you are arguying to elide mandatory work in a lower layer of network stack, isn't it a layer violation to insist like that?
Re: wg(4): encapsulated transport checksums
Hi Richard, Thanks for the patch. I had problems parsing some terminology in your description, so I thought I'd lay out my understanding of the matter, and you can let me know whether or not this corresponds with what you had in mind: - On egress, we must compute the packet checksum, because it may well be forwarded by the receiving end after decapsulation. That doesn't concern this patch, however. - On ingress, we've already checked the poly1305 sum, so we have no doubt that the packet has arrived without corruption. - Therefore, it's not necessary to check the IP checksum on ingress because: * If the packet originated on the peer that did the encapsulation, there's no chance for corruption; * If the packet did not originate on the peer that did the encapsulation, it was that peer's responsibility to drop it if the checksum was wrong; * If the packet does have an incorrect checksum, because the originating peer did not check it, and we forward it along, the machine we forward it to will drop it. It seemed like from your message that you had a case in mind in which it actually would be necessary to check the IP checksum on ingress, but I didn't quite divine what you had in mind. Jason On Fri, Jun 26, 2020 at 10:03 PM wrote: > > Hi, > > On its receive path, wg(4) uses the same mbuf for both the encrypted > capsule and its encapsulated packet, which it passes up to the stack. We > must therefore clear this mbuf's checksum status flags, as although the > capsule may have been subject to hardware offload, its encapsulated packet > was not. > > This ensures that the transport checksums of packets bound for local > delivery are verified. That is necessary because, although the tunnel > provides stronger integrity checks, the tunnel endpoints and the > transport endpoints needn't coincide. > > However, as the network and tunnel endpoints _do_ conincide, it remains > unncessary to check the per-hop IPv4 checksum. > > ok? > > Index: net/if_wg.c > === > RCS file: /cvs/src/sys/net/if_wg.c,v > retrieving revision 1.7 > diff -u -p -u -p -r1.7 if_wg.c > --- net/if_wg.c 23 Jun 2020 10:03:49 - 1.7 > +++ net/if_wg.c 27 Jun 2020 02:48:37 - > @@ -1660,14 +1660,10 @@ wg_decap(struct wg_softc *sc, struct mbu > goto error; > } > > - /* > -* We can mark incoming packet csum OK. We mark all flags OK > -* irrespective to the packet type. > -*/ > - m->m_pkthdr.csum_flags |= (M_IPV4_CSUM_IN_OK | M_TCP_CSUM_IN_OK | > - M_UDP_CSUM_IN_OK | M_ICMP_CSUM_IN_OK); > - m->m_pkthdr.csum_flags &= ~(M_IPV4_CSUM_IN_BAD | M_TCP_CSUM_IN_BAD | > - M_UDP_CSUM_IN_BAD | M_ICMP_CSUM_IN_BAD); > + /* tunneled packet was not offloaded */ > + m->m_pkthdr.csum_flags = 0; > + /* optimise: the tunnel provided a stronger integrity check */ > + m->m_pkthdr.csum_flags |= M_IPV4_CSUM_IN_OK; > > m->m_pkthdr.ph_ifidx = sc->sc_if.if_index; > m->m_pkthdr.ph_rtableid = sc->sc_if.if_rdomain;
wg(4): encapsulated transport checksums
Hi, On its receive path, wg(4) uses the same mbuf for both the encrypted capsule and its encapsulated packet, which it passes up to the stack. We must therefore clear this mbuf's checksum status flags, as although the capsule may have been subject to hardware offload, its encapsulated packet was not. This ensures that the transport checksums of packets bound for local delivery are verified. That is necessary because, although the tunnel provides stronger integrity checks, the tunnel endpoints and the transport endpoints needn't coincide. However, as the network and tunnel endpoints _do_ conincide, it remains unncessary to check the per-hop IPv4 checksum. ok? Index: net/if_wg.c === RCS file: /cvs/src/sys/net/if_wg.c,v retrieving revision 1.7 diff -u -p -u -p -r1.7 if_wg.c --- net/if_wg.c 23 Jun 2020 10:03:49 - 1.7 +++ net/if_wg.c 27 Jun 2020 02:48:37 - @@ -1660,14 +1660,10 @@ wg_decap(struct wg_softc *sc, struct mbu goto error; } - /* -* We can mark incoming packet csum OK. We mark all flags OK -* irrespective to the packet type. -*/ - m->m_pkthdr.csum_flags |= (M_IPV4_CSUM_IN_OK | M_TCP_CSUM_IN_OK | - M_UDP_CSUM_IN_OK | M_ICMP_CSUM_IN_OK); - m->m_pkthdr.csum_flags &= ~(M_IPV4_CSUM_IN_BAD | M_TCP_CSUM_IN_BAD | - M_UDP_CSUM_IN_BAD | M_ICMP_CSUM_IN_BAD); + /* tunneled packet was not offloaded */ + m->m_pkthdr.csum_flags = 0; + /* optimise: the tunnel provided a stronger integrity check */ + m->m_pkthdr.csum_flags |= M_IPV4_CSUM_IN_OK; m->m_pkthdr.ph_ifidx = sc->sc_if.if_index; m->m_pkthdr.ph_rtableid = sc->sc_if.if_rdomain;