Hi,

On Wed, Jul 16, 2025 at 08:44:55AM -0400, Aaron Conole wrote:
> Guillaume Nault <[email protected]> writes:
>
> > On Mon, Jul 14, 2025 at 09:57:52PM +0200, Salvatore Bonaccorso wrote:
> >> Hi,
> >>
> >> Charles Bordet reported the following issue (full context in
> >> https://bugs.debian.org/1108860)
> >>
> >> > Dear Maintainer,
> >> >
> >> > What led up to the situation?
> >> > We run a production environment using Debian 12 VMs, with a network
> >> > topology involving VXLAN tunnels encapsulated inside Wireguard
> >> > interfaces. This setup has worked reliably for over a year, with MTU set
> >> > to 1500 on all interfaces except the Wireguard interface (set to 1420).
> >> > Wireguard kernel fragmentation allowed this configuration to function
> >> > without issues, even though the effective path MTU is lower than 1500.
> >> >
> >> > What exactly did you do (or not do) that was effective (or ineffective)?
> >> > We performed a routine system upgrade, updating all packages including the
> >> > kernel. After the upgrade, we observed severe network issues (timeouts,
> >> > very slow HTTP/HTTPS, and apt update failures) on all VMs behind the
> >> > router. SSH and small-packet traffic continued to work.
> >> >
> >> > To diagnose, we:
> >> >
> >> > * Restored a backup (with the previous kernel): the problem disappeared.
> >> > * Repeated the upgrade, confirming the issue reappeared.
> >> > * Systematically tested each kernel version from 6.1.124-1 up to
> >> > 6.1.140-1. The problem first appears with kernel 6.1.135-1; all earlier
> >> > versions work as expected.
> >> > * Kernel version from the backports (6.12.32-1) did not resolve the
> >> > problem.
> >> >
> >> > What was the outcome of this action?
> >> >
> >> > * With kernel 6.1.135-1 or later, network timeouts occur for
> >> > large-packet protocols (HTTP, apt, etc.), while SSH and small-packet
> >> > protocols work.
> >> > * With kernel 6.1.133-1 or earlier, everything works as expected.
> >> >
> >> > What outcome did you expect instead?
> >> > We expected the network to function as before, with Wireguard handling
> >> > fragmentation transparently and no application-level timeouts,
> >> > regardless of the kernel version.
> >>
> >> While triaging the issue we found that the commit 8930424777e4
> >> ("tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu().") introduces
> >> the issue and Charles confirmed that the issue was present as well in
> >> 6.12.35 and 6.15.4 (other version up could potentially still be
> >> affected, but we wanted to check it is not a 6.1.y specific
> >> regression).
> >>
> >> Reverting the commit fixes Charles' issue.
> >>
> >> Does that ring a bell?
> >
> > It doesn't ring a bell. Do you have more details on the setup that has
> > the problem? Or, ideally, a self-contained reproducer?
>
> +1 - I tested this patch with an OVS setup using vxlan and geneve
> tunnels. A reproducer or more details would help.

Charles, any news here? Did you find a way to provide a self-contained
reproducer for your issue?

Does the issue still reproduce for you on the most current version of
each of the affected stable series?

Regards,
Salvatore

