[Bug 260973] pf: firewall rules stop matching when vnet jails share interface names with the host
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973

--- Comment #5 from Kristof Provost ---
> Kristof, do you know the code well enough to say if it would be possible to
> deny the initial interface rename action if a parent vnet is using the same
> name?

That runs into the same problems as dealing with it when the interface is
returned to the parent vnet, and doesn't account for possible renames after the
interface is moved to a child vnet.
--- Comment #4 from Thomas Steen Rasmussen / Tykling ---
(In reply to Kristof Provost from comment #3)

Thank you for the input. The issue I was hitting is the first one you mention, which is also described in #185619. I've been able to work around it in my own setup by giving the interfaces inside the jails names that are never used on the host (in my case the jail interfaces are called jail0, jail1, etc.).

Also, although this is not strictly needed, one could add an exec.stop entry before rc.shutdown to rename the interfaces back to their original epairNb names, which shouldn't be in use in the parent vnet.

Both of these are workarounds, of course, and they don't begin to consider nested jails with overlapping interface names.

Kristof, do you know the code well enough to say if it would be possible to deny the initial interface rename action if a parent vnet is using the same name?
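The two workarounds from comment #4 could look roughly like the jail.conf fragment below, adapted from the jail definition in the original report. This is an untested sketch: the name jail0 is only an example of a name not used on the host, the remaining parameters of the jail are omitted, and placing the rename ahead of rc.shutdown in exec.stop is one reading of "before rc.shutdown".

```
syslog1_servers_bornhack_org {
    vnet;
    vnet.interface = "epair2b";

    # Workaround 1: rename to a name that is never used on the host
    # ("jail0" is an arbitrary example) instead of em0.
    exec.start += "/sbin/ifconfig epair2b name jail0 && ifconfig jail0 10.1.0.3/24";

    # Workaround 2 (optional): restore the original epair name before
    # shutdown, so the interface returns to the parent vnet as epair2b.
    exec.stop  = "/sbin/ifconfig jail0 name epair2b";
    exec.stop += "/bin/sh /etc/rc.shutdown jail";
}
```

As noted in the comment, neither step addresses nested jails with overlapping interface names.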
--- Comment #3 from Kristof Provost ---
With the disclaimer that this is entirely from memory and may be incorrect or outdated:

I'm aware of several somewhat related issues around interface naming. One is this, that when an interface is moved between vnets (e.g. when the jail it lives in goes away) there's no check for name collisions. That's non-trivial to solve, because the relevant code paths often have no ability to return errors if there's a name collision and the locking around interface names is also unclear (and likely wrong in several places).

There's a loosely related issue with interface groups as well (see #218895, #202178). Now that interfaces can be renamed it's possible to have an interface group and an interface with the same name (and the interface need not even be a member of the group). This has previously triggered panics in pf, as it assumes that interfaces and interface groups share a namespace (and this was historically the case, in that interfaces always ended with a number and groups never did. The former is no longer the case, but the latter is still enforced).

This issue too is difficult to solve for the same reasons as the problem described in this bug (lack of error paths, unclear locking).

When I looked at it last I estimated this to be a significant (plausibly multi-month) effort to fix. I do not expect to work on these problems any time soon.
--- Comment #2 from Thomas Steen Rasmussen / Tykling ---
This statement

> Rebooting with four jails plus the above ruleset enabled means never getting
> any contact to the server at all (ie. the problem manifests from boot).

is not true; my testing was off.

The problem only shows up when vnet jails with the same interface names as on the host are stopped or restarted. This also explains why I had such a hard time reproducing it right after a reboot: it only happens when a jail has been started and is then stopped (or restarted).

This fits the problem description in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=185619 perfectly.
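Since the trigger is stopping a jail whose interface names also exist on the host, it may help to list such overlaps before stopping anything. The following is a minimal sketch, not something taken from the bug report: the function names find_collisions and report_jail_collisions are made up for illustration, and the second function assumes it is run as root on the jail host with ifconfig(8), jls(8) and jexec(8) available.

```shell
#!/bin/sh

# Print every name that appears in both space-separated lists,
# one per line, in the order of the first list.
# $1: interface names on the host, $2: interface names in a jail.
find_collisions() {
    for host_if in $1; do
        for jail_if in $2; do
            if [ "$host_if" = "$jail_if" ]; then
                echo "$host_if"
            fi
        done
    done
}

# Hypothetical wrapper: compare the host's interfaces with those of
# every running jail and report the overlaps per jail.
report_jail_collisions() {
    host_ifs=$(ifconfig -l)
    for j in $(jls name); do
        dup=$(find_collisions "$host_ifs" "$(jexec "$j" ifconfig -l)")
        if [ -n "$dup" ]; then
            echo "jail $j shares interface names with the host:" $dup
        fi
    done
}
```

Calling report_jail_collisions on the host prints one line per jail with overlapping names. Expect lo0 to be listed for every vnet jail, since each vnet has its own loopback interface; the names of interest are those of interfaces that were moved into the jail from the host.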
--- Comment #1 from Thomas Steen Rasmussen / Tykling ---
Maybe related: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=185619

Also, I forgot to mention that at some point yesterday, while trying 100 things, I saw em0 on the host having multiple ether and hwaddr entries. The MAC addresses looked like the ones you see on epair interfaces. I have a screenshot of it if anyone is interested.
Bug ID: 260973
Summary: pf: firewall rules stop matching when vnet jails share interface names with the host
Product: Base System
Version: 13.0-STABLE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: b...@freebsd.org
Reporter: tho...@gibfest.dk

Hello,

I've been building a new vnet jailhost on 13 and I am hitting a weird issue where pf stops permitting traffic it clearly has rules to allow after interfaces inside vnet jails are renamed to the same name as the host interface with the pf rule.

This is on:

FreeBSD nuc1.servers.bornhack.org 13.0-STABLE FreeBSD 13.0-STABLE #1 stable/13-d208638c5: Wed Jan 5 13:32:08 UTC 2022 r...@nuc1.servers.bornhack.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64

The complete ruleset is pretty complex, but I've managed to boil it down to a few lines:

[tykling@nuc1 ~]$ cat testpf.conf
block log all
set skip on lo0
pass in quick on { em0 } proto { tcp } from { 85.235.250.87 } to { (em0) } port { 22 }
[tykling@nuc1 ~]$

The host has an em0 interface:

[tykling@nuc1 ~]$ ifconfig em0
em0: flags=8863 metric 0 mtu 1500
        options=481049b
        ether 1c:69:7a:ab:fe:be
        inet 85.209.118.130/28 broadcast 85.209.118.143
        inet6 fe80::1e69:7aff:feab:febe%em0/64 scopeid 0x1
        inet6 2a09:94c4:55d1:7680::82/64
        media: Ethernet autoselect (1000baseT )
        status: active
        nd6 options=21
[tykling@nuc1 ~]$

The issue seems to be triggered by renaming epair interfaces inside vnet jails to the same name as an interface on the host.

The above pf ruleset works and keeps working if I don't start any vnet jails. It also keeps working if I start vnet jails but don't rename interfaces, or if I start vnet jails but rename the interfaces to something other than em0.
Existing states established before the issue happens keep working (I am working remotely via ssh on the server), but new connections seem to just ignore the pass rule on em0, and the traffic gets blocked even though a rule should permit it:

06:08:46.357935 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799486870 ecr 0], length 0
06:08:47.358590 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799487870 ecr 0], length 0
06:08:49.557897 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799490070 ecr 0], length 0

A wild guess as to the reason might be a race leading to some confusion over which em0 interface is which?

Some more observations:

- It didn't seem to happen with just one vnet jail when I tried narrowing it down. Enabling and starting three more made the problem occur almost instantly.
- Rebooting with four jails plus the above ruleset enabled means never getting any contact to the server at all (ie. the problem manifests from boot).
- Results with two jails were less consistent. The number of jails/interface renames seems to play a role in whether or not the issue is triggered.
- A "service jail restart" will trigger it almost instantly if it doesn't happen right away.
- Renaming interfaces to something other than "em0" also works without any issues.
I hope the issue can be reproduced. I've included the jail.conf file for one of the jails below:

[tykling@nuc1 ~]$ cat /var/run/jail.syslog1_servers_bornhack_org.conf
# Generated by rc.d/jail at 2022-01-06 08:19:08
syslog1_servers_bornhack_org {
        host.hostname = "syslog1.servers.bornhack.org";
        path = "/usr/jails/syslog1.servers.bornhack.org";
        vnet;
        vnet.interface = "epair2b";
        exec.clean;
        exec.system_user = "root";
        exec.jail_user = "root";
        exec.prestart += "ifconfig epair2a destroy 2>/dev/null || true && ifconfig epair2 create up && ifconfig epair2a up && ifconfig bridge1 addm epair2a up";
        exec.start += "/sbin/ifconfig epair2b name em0 && ifconfig em0 10.1.0.3/24 && ifconfig em0 inet6 2a09:94c4:55d1:76A0::3/64";
        exec.start += "route add -inet default 10.1.0.1";
        exec.start += "route add -inet6 default 2a09:94c4:55d1:76A0::1";
        exec.poststop += "ifconfig bridge1 deletem epair2a && ifconfig epair2a destroy";
        exec.start += "/bin/sh /etc/rc";
        exec.stop = "/bin/sh /etc/rc.shutdown jail";
        exec.consolelog = "/var/log/jail_syslog1_servers_bornhack_org_console.log";
        mount.fstab = "/etc/fstab.syslog1_servers_bornhack_org";
        allow.set_hostname = 0;
        allow.sysvipc = 0;
        enforce_statfs = "2";
}
[tykling@n