[Bug 260973] pf: firewall rules stop matching when vnet jails share interface names with the host

2022-02-14 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973

--- Comment #5 from Kristof Provost  ---
> Kristof, do you know the code well enough to say if it would be possible to 
> deny the initial interface rename action if a parent vnet is using the same 
> name?


That runs into the same problems as dealing with it when the interface is
returned to the parent vnet, and doesn't account for possible renames after the
interface is moved to a child vnet.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 260973] pf: firewall rules stop matching when vnet jails share interface names with the host

2022-02-14 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973

--- Comment #4 from Thomas Steen Rasmussen / Tykling  ---
(In reply to Kristof Provost from comment #3)

Thank you for the input. The issue I was hitting is the first one you mention -
also described in #185619 - and I've been able to work around it in my own
setup by inventing some interface names inside the jails which are never used
on the host (in my case the jail interfaces are called jail0, jail1 etc).

Also, this is not strictly needed, but one could add an exec.stop entry before
rc.shutdown to rename the interfaces back to their original epairNb name which
shouldn't be in use in the parent vnet.

Both of these are workarounds of course, and they don't begin to address
nested jails with overlapping interface names.
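
The first workaround above (jail-side names that are never used on the host) can be sketched in shell. This is only an illustration under stated assumptions: `pick_jail_ifname` is a hypothetical helper, not part of jail(8) or rc.d/jail, and the `jailN` prefix simply follows the naming described in this comment. It takes a space-separated interface list and returns the first `jailN` name that does not appear in it.

```sh
#!/bin/sh
# Hypothetical helper: print the first "jailN" name that is absent from
# the space-separated interface list passed as $1.
pick_jail_ifname() {
    host_ifs=" $1 "          # pad with spaces so whole names are matched
    n=0
    while :; do
        case "$host_ifs" in
            *" jail$n "*) n=$((n + 1)) ;;      # name taken, try the next one
            *) echo "jail$n"; return 0 ;;
        esac
    done
}

# Example with a literal list; on a real host one might pass "$(ifconfig -l)".
pick_jail_ifname "em0 lo0 bridge1 epair2a"
```

The chosen name would take the place of the host-colliding name in the jail's rename command. The check only looks at the list it is given, so it does not cover nested jails or renames made after the interface has moved to a child vnet.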

Kristof, do you know the code well enough to say if it would be possible to
deny the initial interface rename action if a parent vnet is using the same
name?



[Bug 260973] pf: firewall rules stop matching when vnet jails share interface names with the host

2022-02-14 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973

--- Comment #3 from Kristof Provost  ---
With the disclaimer that this is entirely from memory and may be incorrect or
outdated:

I'm aware of several somewhat related issues around interface naming. One is
the issue in this bug: when an interface is moved between vnets (e.g. when the
jail it lives in goes away) there's no check for name collisions.
That's non-trivial to solve, because the relevant code paths often have no
ability to return errors if there's a name collision and the locking around
interface names is also unclear (and likely wrong in several places).

There's a loosely related issue with interface groups as well (see #218895,
#202178). Now that interfaces can be renamed it's possible to have an interface
group and an interface with the same name (and the interface need not even be a
member of the group). This has previously triggered panics in pf, as it assumes
that interfaces and interface groups share a namespace (and this was
historically the case, in that interfaces always ended with a number and groups
never did. The former is no longer the case, but the latter is still enforced).
This issue too is difficult to solve for the same reasons as the problem
described in this bug (lack of error paths, unclear locking).

When I looked at it last I estimated this to be a significant (plausibly
multi-month) effort to fix. I do not expect to work on these problems any time
soon.



[Bug 260973] pf: firewall rules stop matching when vnet jails share interface names with the host

2022-01-06 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973

--- Comment #2 from Thomas Steen Rasmussen / Tykling  ---
This statement

- Rebooting with four jails plus the above ruleset enabled means never getting
any contact to the server at all (ie. the problem manifests from boot).

is not true; my testing was off. The problem only shows up when vnet jails with
the same interface names as on the host are stopped or restarted. This also
explains why I had such a hard time reproducing it right after a reboot: it
only happens after a jail has been started and is then stopped (or restarted).

This fits the problem description in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=185619 perfectly
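
The corrected trigger (load the ruleset, then stop or restart the jails) could be scripted roughly as follows. This is a hedged sketch, not a verified reproducer: it assumes the reduced ruleset from the original report is saved as `testpf.conf` in the current directory, and the `run` and `repro` helpers are hypothetical names. By default it only prints the commands; setting `RUN=1` executes them and requires root on the jail host.

```sh
#!/bin/sh
# Dry-run wrapper: print the command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

repro() {
    run pfctl -f testpf.conf     # load the reduced ruleset from the report
    run service jail restart     # stop and restart the vnet jails (the trigger)
    run pfctl -vvsr              # show rules with counters to see what matches
}

repro
```

After the restart, a new ssh connection from the permitted source address is the actual test; an existing session is not, because states created before the restart keep working.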



[Bug 260973] pf: firewall rules stop matching when vnet jails share interface names with the host

2022-01-06 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973

--- Comment #1 from Thomas Steen Rasmussen / Tykling  ---
Possibly related: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=185619

Also, I forgot to mention: at some point yesterday, while trying many different
things, I saw em0 on the host with multiple ether and hwaddr entries. The MAC
addresses looked like the ones you see on epair interfaces. I have a screenshot
of it if anyone is interested.



[Bug 260973] pf: firewall rules stop matching when vnet jails share interface names with the host

2022-01-06 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973

    Bug ID: 260973
   Summary: pf: firewall rules stop matching when vnet jails share
            interface names with the host
   Product: Base System
   Version: 13.0-STABLE
  Hardware: Any
        OS: Any
    Status: New
  Severity: Affects Some People
  Priority: ---
 Component: kern
  Assignee: b...@freebsd.org
  Reporter: tho...@gibfest.dk

Hello,

I've been building a new vnet jailhost on 13 and I am hitting a weird issue
where pf stops permitting traffic it clearly has rules to allow after
interfaces inside vnet jails are renamed to the same name as the host interface
with the pf rule.

This is on FreeBSD nuc1.servers.bornhack.org 13.0-STABLE FreeBSD 13.0-STABLE #1
stable/13-d208638c5: Wed Jan  5 13:32:08 UTC 2022
r...@nuc1.servers.bornhack.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

The complete ruleset is pretty complex but I've managed to cook it down to a
few lines:

[tykling@nuc1 ~]$ cat testpf.conf 
block log all
set skip on lo0
pass in quick on { em0 } proto { tcp } from { 85.235.250.87 } to { (em0) } port { 22 }
[tykling@nuc1 ~]$ 

The host has an em0 interface:

[tykling@nuc1 ~]$ ifconfig em0
em0: flags=8863 metric 0 mtu 1500
	options=481049b
	ether 1c:69:7a:ab:fe:be
	inet 85.209.118.130/28 broadcast 85.209.118.143
	inet6 fe80::1e69:7aff:feab:febe%em0/64 scopeid 0x1
	inet6 2a09:94c4:55d1:7680::82/64
	media: Ethernet autoselect (1000baseT )
	status: active
	nd6 options=21
[tykling@nuc1 ~]$ 

The issue seems to be triggered by renaming epair interfaces inside vnet jails
to the same name as an interface on the host.

The above pf ruleset works and keeps working if I don't start any vnet jails.
It also keeps working if I start vnet jails but don't rename interfaces. It
also keeps working if I start vnet jails but rename the interfaces to something
other than em0.

Existing states established before the issue happens keep working (I am working
remotely via ssh on the server), but new connections seem to ignore the pass
rule on em0 and get blocked even though the rule should permit them:

06:08:46.357935 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799486870 ecr 0], length 0
06:08:47.358590 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799487870 ecr 0], length 0
06:08:49.557897 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799490070 ecr 0], length 0

A wild guess as to the reason might be a race leading to some confusion over
which em0 interface is which?

Some more observations:
- It didn't seem to happen with just one vnet jail when I tried narrowing it
down. Enabling and starting three more made the problem occur almost instantly.
- Rebooting with four jails plus the above ruleset enabled means never getting
any contact to the server at all (ie. the problem manifests from boot).
- Results with two jails were less consistent. The number of jails/interface
renames seems to play a role in whether or not the issue is triggered.
- A "service jail restart" will trigger it almost instantly if it hasn't
already happened.
- Renaming interfaces to something other than "em0" also works without any
issues.
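
Since the problem only appears when a jail interface is renamed to a name that also exists on the host, a quick check for such collisions may help when narrowing it down. The sketch below is an illustration, not an official tool: `rename_targets` and `colliding_names` are hypothetical helper names, and the parsing assumes renames are written as `ifconfig <if> name <new>` on a single line, as in the jail.conf included below.

```sh
#!/bin/sh
# Print the new names from "ifconfig <if> name <new>" commands read on stdin.
rename_targets() {
    grep -oE 'ifconfig[[:space:]]+[^[:space:]]+[[:space:]]+name[[:space:]]+[^[:space:]"&;]+' |
        awk '{ print $4 }'
}

# Print every rename target that also appears in the space-separated
# interface list passed as $1.
colliding_names() {
    host_ifs=" $1 "
    rename_targets | sort -u | while read -r name; do
        case "$host_ifs" in
            *" $name "*) echo "$name" ;;
        esac
    done
}

# Example: one exec.start line checked against a literal host interface list.
echo 'exec.start += "/sbin/ifconfig epair2b name em0 && ifconfig em0 10.1.0.3/24";' |
    colliding_names "em0 lo0 bridge1 epair2a"
```

On a real host one might feed it the generated jail configuration files and pass `"$(ifconfig -l)"` as the interface list; both of those inputs are assumptions about the setup rather than something the check depends on.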

I hope this can be reproduced. I've included the jail.conf file for one of the
jails below:

[tykling@nuc1 ~]$ cat /var/run/jail.syslog1_servers_bornhack_org.conf
# Generated by rc.d/jail at 2022-01-06 08:19:08
syslog1_servers_bornhack_org {
    host.hostname = "syslog1.servers.bornhack.org";
    path = "/usr/jails/syslog1.servers.bornhack.org";
    vnet;
    vnet.interface = "epair2b";
    exec.clean;
    exec.system_user = "root";
    exec.jail_user = "root";
    exec.prestart += "ifconfig epair2a destroy 2>/dev/null || true && ifconfig epair2 create up && ifconfig epair2a up && ifconfig bridge1 addm epair2a up";
    exec.start += "/sbin/ifconfig epair2b name em0 && ifconfig em0 10.1.0.3/24 && ifconfig em0 inet6 2a09:94c4:55d1:76A0::3/64";
    exec.start += "route add -inet default 10.1.0.1";
    exec.start += "route add -inet6 default 2a09:94c4:55d1:76A0::1";
    exec.poststop += "ifconfig bridge1 deletem epair2a && ifconfig epair2a destroy";
    exec.start += "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown jail";
    exec.consolelog = "/var/log/jail_syslog1_servers_bornhack_org_console.log";
    mount.fstab = "/etc/fstab.syslog1_servers_bornhack_org";
    allow.set_hostname = 0;
    allow.sysvipc = 0;
    enforce_statfs = "2";
}
[tykling@n