Very cool that this is fixed!

Mark Schouten
> On 2 Jul 2021 at 22:58, Thomas Lamprecht <[email protected]> wrote:
>
> On 29.06.21 10:05, Mark Schouten wrote:
>> Hi,
>>
>> On 24-06-2021 at 15:16, Martin Maurer wrote:
>>> We are pleased to announce the first beta release of Proxmox Virtual
>>> Environment 7.0! The 7.x family is based on the great Debian 11 "Bullseye"
>>> and comes with a 5.11 kernel, QEMU 6.0, LXC 4.0, OpenZFS 2.0.4.
>>
>> I just upgraded a node in our demo cluster and all seemed fine. Except for
>> non-working cluster network. I was unable to ping the node through the
>> cluster interface, pvecm saw no other nodes and ceph was broken.
>>
>> However, if I ran tcpdump, ping started working, but not the rest.
>>
>> Interesting situation, which I 'fixed' by disabling vlan-aware-bridge for
>> that interface. After the reboot, everything works (AFAICS).
>>
>> If Proxmox wants to debug this, feel free to reach out to me, I can grant
>> you access to this node so you can check it out.
>>
>
> FYI, there was some more investigation regarding this, mostly spearheaded
> by Wolfgang, and we found and fixed[0] an actual, rather old (the commit it
> fixes is from 2014!) bridge bug in the kernel.
>
> The first few lines of the fix's commit message[0] explain the basics:
>
>> [..] bridges with `vlan_filtering 1` and only 1 auto-port don't
>> set IFF_PROMISC for unicast-filtering-capable ports.
>
> Further, we saw all that weird behavior because:
>
> * While this is independent of any specific network driver, those drivers
>   vary wildly in how they do things, and some thus worked (by luck) while
>   others did not.
>
> * It can really only happen in the vlan-aware case, as otherwise all ports
>   are set promisc no matter what, but depending on the order in which
>   things are done the result may still differ even with vlan-aware on.
>
> * It did not matter before (i.e., before systemd started to also apply
>   their MACAddressPolicy by default onto virtual devices like bridges)
>   because then the bridge basically always had a MAC from one of its
>   ports, so the fdb always contained the bridge's MAC implicitly and the
>   bug was concealed.
>
> So it's quite likely that this rather confusing mix of behaviors would
> have popped up in more places where bridges are used in the upcoming
> months, as that systemd change slowly rolled into stable distros, so it's
> actually really nice to find and fix (*knocks wood*) this during beta!
>
> Anyhow, a newer kernel build is now also available in the bullseye based
> pvetest repository, if you want to test and confirm the fix:
>
> pve-kernel-5.11.22-1-pve version 5.11.22-2
>
> cheers,
> Thomas
>
> [0]: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=a019abd80220
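
For anyone who wants to check whether a node matches the conditions described above, here is a rough sketch (not from Thomas's mail, so treat it as an assumption-laden helper rather than an official diagnostic). It reads the standard sysfs entries under /sys/class/net: the bridge's `bridge/vlan_filtering`, the ports listed in `brif/`, and each interface's `flags` and `address`. The function names `bridge_report` and `looks_affected` are made up for illustration, and the check is only a heuristic: sysfs does not tell us whether a port's driver is unicast-filtering-capable, so a match means "worth a closer look", nothing more.

```python
from pathlib import Path

IFF_PROMISC = 0x100  # promiscuous-mode bit, as defined in <linux/if.h>


def bridge_report(bridge, sysfs_root="/sys/class/net"):
    """Collect vlan_filtering state and per-port promisc/MAC info for a bridge."""
    root = Path(sysfs_root)
    base = root / bridge
    vlan_filtering = (base / "bridge" / "vlan_filtering").read_text().strip() == "1"
    bridge_mac = (base / "address").read_text().strip().lower()

    ports = {}
    for name in sorted(p.name for p in (base / "brif").iterdir()):
        port = root / name
        # The flags file holds a hex string such as "0x1103".
        flags = int((port / "flags").read_text().strip(), 16)
        mac = (port / "address").read_text().strip().lower()
        ports[name] = {
            "promisc": bool(flags & IFF_PROMISC),
            "same_mac_as_bridge": mac == bridge_mac,
        }
    return {"vlan_filtering": vlan_filtering, "ports": ports}


def looks_affected(report):
    """Heuristic only: vlan-aware bridge, exactly one port, port not promisc,
    and the bridge MAC is not inherited from that port."""
    if not report["vlan_filtering"] or len(report["ports"]) != 1:
        return False
    (port,) = report["ports"].values()
    return not port["promisc"] and not port["same_mac_as_bridge"]
```

On a Proxmox node one would call something like `looks_affected(bridge_report("vmbr0"))`, where `vmbr0` is just the usual default bridge name and may differ on your setup. The `sysfs_root` parameter exists only so the helper can be tried against a copied or fake directory tree.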
