Re: [External] : Re: pf route-to issues
On 2021/01/26 09:29, Alexandr Nedvedicky wrote: > Hello, > > > > > > > > > > > I'm not sure if proposed scenario real. Let's assume there > > > is a PF box with three NICs running on this awkward set up > > > > > > em1 ... 192.168.1.10 > > > > > > em0 > > > > > > em2 ... 192.168.1.10 > > > > > > em0 is attached to LAN em1 and em2 are facing to internet which is > > > reachable with two different physical lines. both lines are connected > > > via > > > equipment, which uses fixed IP address 192.168.1.10 and PF admin has > > > no way to change that. > > > > in this scenario are em1 and em2 connected to the same ethernet > > segment and ip subnet? > > em1 and em2 are connected to different wires (broadcast domains). > > and I agree this set up is really awkward. not sure if it will work > because it also depends on how ARP table is organized. I think it > works on Solaris, but I have not tried that with OpenBSD. I don't know about ethernet/ARP, but that does work with pppoe. FWIW this is how I was doing things. (I'm not doing this any more) PRIO_LINES="81.187.8.161@pppoe4 81.187.218.235@pppoe5" PRIO_LINES_V6="fe80::@pppoe4 fe80::@pppoe5" .. pass in log quick on {lan, wifi} proto udp to 193.35.128.0/20 port {500, 4500} tag uma-orange route-to {$PRIO_LINES} pass in log quick on {lan, wifi} proto {esp, ah} to 193.35.128.0/20 tag uma-orange route-to {$PRIO_LINES} pass out log quick on aaisp tagged uma-orange nat-to 81.187.41.148/30 random sticky-address static-port pass in quick inet proto {tcp,udp} from {(self), 10.71.12.5, 192.168.10.5} to !10/8 \ port domain tag priority-dns route-to {$PRIO_LINES} pass in quick inet6 proto {tcp,udp} from {(self), 2001:8b0:6412:11::5} to !2001:8b0:6412::/48 \ port domain tag priority-dns route-to {$PRIO_LINES_V6} pass out on aaisp tagged priority-dns .. In these I used the "my side" address on the pppoe interfaces because the "ISP side" address was always 81.187.81.187. AFAIK PF didn't care about the actual address when it was point-to-point and just a dummy address worked. (Same with the standard route table too, but that got changed, I think maybe with ART). > > the other is to add add routes "beyond" 192.168.1.10 and route-to > > them: > > > > # route add 127.0.1.10 192.168.1.10 -ifp em1 > > # route add 127.0.2.10 192.168.1.10 -ifp em2 > > > > then you can write rules like this: > > > > pass in on em0 from 172.16.0.0/16 route-to 127.0.1.10 > > pass in on em0 from 172.17.0.0/16 route-to 127.0.2.10 That sounds like a usable hack. Maybe even better than before (this route should go down with the interface, so the traffic wouldn't end up blackholed like it did with route-to ip@iface). I think that's easier to deal with than rtables if you have incoming traffic to services running on the router itself. > yes I understand that. I'm not able to judge how many currently working > configurations will remain broken after we kill 'address@interface' form. > I'm sure many deployments will get fixed with simple tweak: > echo 'address@interface'|cut -d @ -f 1 standard ethernet: address@em0 -> address ppp, different isps:address@pppoe0 -> (pppoe0:peer) ppp with same isps, or conflicting ethernet: rtable or "routes beyond" hack dhcp, rtadv:ETOOHARD, don't think it worked before without scripted pf.conf changes anyway. use rtable.
Re: [External] : Re: pf route-to issues
Hello, On Tue, Jan 26, 2021 at 12:33:25PM +0100, Alexander Bluhm wrote: > On Tue, Jan 26, 2021 at 10:39:30AM +1000, David Gwynne wrote: > > > But what about dup-to? The packet is duplicated for both directions. > > > I guess the main use case for dup-to is implementing a monitor port. > > > There you have to pass packets stateless, otherwise it would not > > > work anyway. The strange semantics is not related to this diff. > > > > are you saying i should skip pf_test for all dup-to generated packets? > > I am not sure. > > When we have an in dup-to rule, the incoming packets in request > direction are dupped and tested with the out ruleset. The reply > packets for this state are also dupped, but not tested when they > leave the dup interface. > > This is inconsistent and cannot work statefully. Stateful filtering > with dupped packets does not make sense anyway. The only working > config is "pass out on dup-interface no state". > > Do we think this rule should be required? > > 1. No packet should leave an interface without a rule. > > if (pd->dir == PF_IN || s->rt == PF_DUPTO) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > 2. The config says we want a monitor port. We risk that the >original packet and the dupped packet match the same rule. >Stateful filtering cannot work, we do not expect reply packets >for the dups. > > if (pd->dir == PF_IN && s->rt != PF_DUPTO) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > 3. Some sort of problem was there before, but different. Don't >address it now. > another option would be to mark duped packet as GENERATED to short circuit pf_test() here: 6871 6872 if ((*m0)->m_pkthdr.pf.flags & PF_TAG_GENERATED) 6873 return (PF_PASS); 6874 perhaps excluding those changes from current diff is good. this seems to be separate issue anyway. thanks and regards sashan
Re: pf route-to issues
On Tue, Jan 26, 2021 at 12:33:25PM +0100, Alexander Bluhm wrote: > On Tue, Jan 26, 2021 at 10:39:30AM +1000, David Gwynne wrote: > > > But what about dup-to? The packet is duplicated for both directions. > > > I guess the main use case for dup-to is implementing a monitor port. > > > There you have to pass packets stateless, otherwise it would not > > > work anyway. The strange semantics is not related to this diff. > > > > are you saying i should skip pf_test for all dup-to generated packets? > > I am not sure. > > When we have an in dup-to rule, the incoming packets in request > direction are dupped and tested with the out ruleset. The reply > packets for this state are also dupped, but not tested when they > leave the dup interface. > > This is inconsistent and cannot work statefully. Stateful filtering > with dupped packets does not make sense anyway. The only working > config is "pass out on dup-interface no state". > > Do we think this rule should be required? dup-to is tricky. In general you should run the collector on its own interface and then 'set skip on $dupif' could work. I would assume that the copy of the packet is bypassing pf and is just sent directly without hitting pf_test again. At least for route-to & reply-to I would expect this behaviour (like I do not expect that I need an extra pass rule to allow a rdr-to through). > 1. No packet should leave an interface without a rule. > > if (pd->dir == PF_IN || s->rt == PF_DUPTO) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > 2. The config says we want a monitor port. We risk that the >original packet and the dupped packet match the same rule. >Stateful filtering cannot work, we do not expect reply packets >for the dups. > > if (pd->dir == PF_IN && s->rt != PF_DUPTO) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) I guess this is for the case where route-to is used in PF_IN. I agree this should be done so that the state table is properly set. Skipping this for the copy of dup-to packets makes sense. Running pf_test() for the same mbuf and direction but with different ifp is causing more harm the good. > 3. Some sort of problem was there before, but different. Don't >address it now. > > Maybe 2 has less impact for the users and is easy to understand. > We should document that in the man page. > > > > We are reaching a state where this diff can go in. I just startet > > > a regress run with it. OK bluhm@ > > > > hopefully i fixed the pfctl error messages up so the regress tests arent > > too unhappy. > > pf forward and pf fragment tests pass. They include route-to and > reply-to rules. I have no test for dup-to. Regress pfctl fails, > but I think dlg@ has a diff for that. > > bluhm > -- :wq Claudio
Re: pf route-to issues
On Tue, Jan 26, 2021 at 10:39:30AM +1000, David Gwynne wrote: > > But what about dup-to? The packet is duplicated for both directions. > > I guess the main use case for dup-to is implementing a monitor port. > > There you have to pass packets stateless, otherwise it would not > > work anyway. The strange semantics is not related to this diff. > > are you saying i should skip pf_test for all dup-to generated packets? I am not sure. When we have an in dup-to rule, the incoming packets in request direction are dupped and tested with the out ruleset. The reply packets for this state are also dupped, but not tested when they leave the dup interface. This is inconsistent and cannot work statefully. Stateful filtering with dupped packets does not make sense anyway. The only working config is "pass out on dup-interface no state". Do we think this rule should be required? 1. No packet should leave an interface without a rule. if (pd->dir == PF_IN || s->rt == PF_DUPTO) { if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) 2. The config says we want a monitor port. We risk that the original packet and the dupped packet match the same rule. Stateful filtering cannot work, we do not expect reply packets for the dups. if (pd->dir == PF_IN && s->rt != PF_DUPTO) { if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) 3. Some sort of problem was there before, but different. Don't address it now. Maybe 2 has less impact for the users and is easy to understand. We should document that in the man page. > > We are reaching a state where this diff can go in. I just startet > > a regress run with it. OK bluhm@ > > hopefully i fixed the pfctl error messages up so the regress tests arent > too unhappy. pf forward and pf fragment tests pass. They include route-to and reply-to rules. I have no test for dup-to. Regress pfctl fails, but I think dlg@ has a diff for that. bluhm
Re: [External] : Re: pf route-to issues
Hello, On Tue, Jan 26, 2021 at 10:39:30AM +1000, David Gwynne wrote: > On Mon, Jan 25, 2021 at 04:19:11PM +0100, Alexander Bluhm wrote: > > On Fri, Jan 22, 2021 at 06:07:59PM +1000, David Gwynne wrote: > > > --- sys/conf/GENERIC 30 Sep 2020 14:51:17 - 1.273 > > > +++ sys/conf/GENERIC 22 Jan 2021 07:33:30 - > > > @@ -82,6 +82,7 @@ pseudo-device msts1 # MSTS line discipl > > > pseudo-deviceendrun 1 # EndRun line discipline > > > pseudo-devicevnd 4 # vnode disk devices > > > pseudo-deviceksyms 1 # kernel symbols device > > > +pseudo-devicekstat > > > #pseudo-device dt # Dynamic Tracer > > > > > > # clonable devices > > > > This is an unrelated chunk. > > oh yeah... > > > > +pf_route(struct pf_pdesc *pd, struct pf_state *s) > > ... > > > + if (pd->dir == PF_IN) { > > > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > > > Yes, this is the correct logic. When the packet comes in, pf > > overrides forwarding, tests the out rules, and sends it. For > > outgoing packets on out route-to rules we have already tested the > > rules. It also works for reply-to the other way around. > > yep. I'm suggesting to keep current if () in. and change it with follow up commit, which will adjust match rule for route-to/reply-to. would it be OK with you, David? > > > But what about dup-to? The packet is duplicated for both directions. > > I guess the main use case for dup-to is implementing a monitor port. > > There you have to pass packets stateless, otherwise it would not > > work anyway. The strange semantics is not related to this diff. > > are you saying i should skip pf_test for all dup-to generated packets? this makes perfect sense for me. > > > We are reaching a state where this diff can go in. I just startet > > a regress run with it. OK bluhm@ > > hopefully i fixed the pfctl error messages up so the regress tests arent > too unhappy. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, > > > goto bad; > > > > here we do 'goto bad', which does not call if_put(). > > yes it does. the whole chunk with the diff applied is: > > done: > if (s->rt != PF_DUPTO) > pd->m = NULL; > if_put(ifp); > rtfree(rt); > return; > > bad: > m_freem(m0); > goto done; > } > > bad drops the mbuf and then goes to done. yes you are absolutely right, I need glasses. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, > > i'll need help with "match on em0 route-to $if_em0_peer". or we can do > that later separately? may be can we keep this line in pf_route() untouched at least for now: 6041 6042 if (pd->kif->pfik_ifp != ifp) { 6043 if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) 6044 goto bad; 6045 else if (m0 == NULL) 6046 goto done; 6047 if (m0->m_len < sizeof(struct ip)) { 6048 DPFPRINTF(LOG_ERR, 6049 "%s: m0->m_len < sizeof(struct ip)", __func__); 6050 goto bad; 6051 } 6052 ip = mtod(m0, struct ip *); 6053 } 6054 I think if () at line 6042 does not hurt pfsync(4). This should be removed with commit, which will introduce the 'match ... route-to'. There should be more detailed explanation in my response to email from bluhm@. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, > > > > > > I'm not sure if proposed scenario real. Let's assume there > > is a PF box with three NICs running on this awkward set up > > > > em1 ... 192.168.1.10 > > > > em0 > > > > em2 ... 192.168.1.10 > > > > em0 is attached to LAN em1 and em2 are facing to internet which is > > reachable with two different physical lines. both lines are connected > > via > > equipment, which uses fixed IP address 192.168.1.10 and PF admin has > > no way to change that. > > in this scenario are em1 and em2 connected to the same ethernet > segment and ip subnet? em1 and em2 are connected to different wires (broadcast domains). and I agree this set up is really awkward. not sure if it will work because it also depends on how ARP table is organized. I think it works on Solaris, but I have not tried that with OpenBSD. > > > the 'address@interface' syntax is the only way to define rules: > > > > pass in on em0 from 172.16.0.0/16 route-to 192.168.1.10@em1 > > pass in on em0 from 172.17.0.0/16 route-to 192.168.1.10@em2 > > > > regardless of how much real such scenario is I believe it can > > currently work. > > this is a very awkward configuration. while i think what it's trying > to do is useful, how it is expressed relies on something i want to > break (fix?). yes I completely agree. I believe this 'awkward thinking' stems back to ipf (ipfilter). I found my ancient notes to find some details on this. My notes refer to ipf.conf(5) manpage [1]. If I understand the purpose of 'address@interface' syntax right, it allows us to forward packets using explicit pair [ next-hop IP, interface ]. Thus firewall can emulate a route, which might be missing in routing table. It just 'blindly' sends matching packet to next-hop using desired interface assuming that 'next-hop' host will do the right thing with such packet. > > one of the original reasons i wanted to break this kind of config > is because pfsync hasn't got a way to exchange interace information, > and different firewalls are going to have different interface > topologies anyway. one of the reasons to only use a destination/next > hop as the argument to route-to rules was so pfsync would work. yes I understand that. I'm not able to judge how many currently working configurations will remain broken after we kill 'address@interface' form. I'm sure many deployments will get fixed with simple tweak: echo 'address@interface'|cut -d @ -f 1 but I'm not sure how many people do still need to order desired NIC. > > i'm pretty sure this is broken at the moment because of bugs in the > routing code. it is possible to configure routes to 192.168.1.10 > via both em1 and em2 if net.inet.ip.multipath is set to 1, but im > sure the llinfo (arp and rtable) part of this kind of multipath > route setup does not work reliably. i guess i should try and get > my fixes for this into the tree. > > there are two alternate ways i can think of to do this. the first > is to configure an rtable for each interface: > > # route -T 1 add default 192.168.1.10 -ifp em1 > # route -T 2 add default 192.168.1.10 -ifp em2 > > then you could write rules like this: > > pass in on em0 from 172.16.0.0/16 rtable 1 > pass in on em0 from 172.17.0.0/16 rtable 2 > > the other is to add add routes "beyond" 192.168.1.10 and route-to > them: > > # route add 127.0.1.10 192.168.1.10 -ifp em1 > # route add 127.0.2.10 192.168.1.10 -ifp em2 > > then you can write rules like this: > > pass in on em0 from 172.16.0.0/16 route-to 127.0.1.10 > pass in on em0 from 172.17.0.0/16 route-to 127.0.2.10 > > this will likely hit the same bugs in the rtable/arp code i referred > to above though. > > also, note that i haven't tested either of these. > I think this sounds like a plan how to deal with edge cases, which can't get fixed with simple 'cut -d @ -f 1' thanks and regards sashan [1] https://www.freebsd.org/cgi/man.cgi?query=ipf.conf&sektion=5&apropos=0 (search for reply-to)
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 05:38:40PM +0100, Alexandr Nedvedicky wrote: > Hello, > > > On Mon, Jan 25, 2021 at 03:21:29PM +0100, Alexander Bluhm wrote: > > Hi, > > > > Some personal thoughts. I am happy when pf route-to gets simpler. > > Especially I have never understood what this address@interface > > syntax is used for. > > > > I cannot estimate what configuration is used by our cutomers in > > many installations. Simple syntax change address@interface -> > > address of next hob should be no problem. Slight semantic changes > > have to be dealt with. Current packet flow is complicated and may > > be inspired by old NAT behavior. As long it becomes more sane and > > easier to understand, we should change it. > > > I'm not sure if proposed scenario real. Let's assume there > is a PF box with three NICs running on this awkward set up > > em1 ... 192.168.1.10 > > em0 > > em2 ... 192.168.1.10 > > em0 is attached to LAN em1 and em2 are facing to internet which is > reachable with two different physical lines. both lines are connected via > equipment, which uses fixed IP address 192.168.1.10 and PF admin has > no way to change that. in this scenario are em1 and em2 connected to the same ethernet segment and ip subnet? > the 'address@interface' syntax is the only way to define rules: > > pass in on em0 from 172.16.0.0/16 route-to 192.168.1.10@em1 > pass in on em0 from 172.17.0.0/16 route-to 192.168.1.10@em2 > > regardless of how much real such scenario is I believe it can > currently work. this is a very awkward configuration. while i think what it's trying to do is useful, how it is expressed relies on something i want to break (fix?). one of the original reasons i wanted to break this kind of config is because pfsync hasn't got a way to exchange interace information, and different firewalls are going to have different interface topologies anyway. one of the reasons to only use a destination/next hop as the argument to route-to rules was so pfsync would work. i'm pretty sure this is broken at the moment because of bugs in the routing code. it is possible to configure routes to 192.168.1.10 via both em1 and em2 if net.inet.ip.multipath is set to 1, but im sure the llinfo (arp and rtable) part of this kind of multipath route setup does not work reliably. i guess i should try and get my fixes for this into the tree. there are two alternate ways i can think of to do this. the first is to configure an rtable for each interface: # route -T 1 add default 192.168.1.10 -ifp em1 # route -T 2 add default 192.168.1.10 -ifp em2 then you could write rules like this: pass in on em0 from 172.16.0.0/16 rtable 1 pass in on em0 from 172.17.0.0/16 rtable 2 the other is to add add routes "beyond" 192.168.1.10 and route-to them: # route add 127.0.1.10 192.168.1.10 -ifp em1 # route add 127.0.2.10 192.168.1.10 -ifp em2 then you can write rules like this: pass in on em0 from 172.16.0.0/16 route-to 127.0.1.10 pass in on em0 from 172.17.0.0/16 route-to 127.0.2.10 this will likely hit the same bugs in the rtable/arp code i referred to above though. also, note that i haven't tested either of these. cheers, dlg
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 06:17:02PM +0100, Alexandr Nedvedicky wrote: > Hello, > > pf_route() might leak a refence to ifp. oh no :( > > Index: sys/net/pf.c > > === > > RCS file: /cvs/src/sys/net/pf.c,v > > retrieving revision 1.1101 > > diff -u -p -r1.1101 pf.c > > --- sys/net/pf.c19 Jan 2021 22:22:23 - 1.1101 > > +++ sys/net/pf.c22 Jan 2021 07:33:31 - > > > > > @@ -5998,48 +5994,40 @@ pf_route(struct pf_pdesc *pd, struct pf_ > > > > ip = mtod(m0, struct ip *); > > > > > + > > + ifp = if_get(rt->rt_ifidx); > > if (ifp == NULL) > > goto bad; > > here we get a reference to ifp. yep. > > @@ -6082,9 +6060,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ > > */ > > if (ip->ip_off & htons(IP_DF)) { > > ipstat_inc(ips_cantfrag); > > - if (r->rt != PF_DUPTO) > > + if (s->rt != PF_DUPTO) > > pf_send_icmp(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, > > - ifp->if_mtu, pd->af, r, pd->rdomain); > > + ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain); > > goto bad; > > here we do 'goto bad', which does not call if_put(). yes it does. the whole chunk with the diff applied is: done: if (s->rt != PF_DUPTO) pd->m = NULL; if_put(ifp); rtfree(rt); return; bad: m_freem(m0); goto done; } bad drops the mbuf and then goes to done.
Re: pf route-to issues
On Mon, Jan 25, 2021 at 04:19:11PM +0100, Alexander Bluhm wrote: > On Fri, Jan 22, 2021 at 06:07:59PM +1000, David Gwynne wrote: > > --- sys/conf/GENERIC30 Sep 2020 14:51:17 - 1.273 > > +++ sys/conf/GENERIC22 Jan 2021 07:33:30 - > > @@ -82,6 +82,7 @@ pseudo-device msts1 # MSTS line discipl > > pseudo-device endrun 1 # EndRun line discipline > > pseudo-device vnd 4 # vnode disk devices > > pseudo-device ksyms 1 # kernel symbols device > > +pseudo-device kstat > > #pseudo-device dt # Dynamic Tracer > > > > # clonable devices > > This is an unrelated chunk. oh yeah... > > +pf_route(struct pf_pdesc *pd, struct pf_state *s) > ... > > + if (pd->dir == PF_IN) { > > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > Yes, this is the correct logic. When the packet comes in, pf > overrides forwarding, tests the out rules, and sends it. For > outgoing packets on out route-to rules we have already tested the > rules. It also works for reply-to the other way around. yep. > But what about dup-to? The packet is duplicated for both directions. > I guess the main use case for dup-to is implementing a monitor port. > There you have to pass packets stateless, otherwise it would not > work anyway. The strange semantics is not related to this diff. are you saying i should skip pf_test for all dup-to generated packets? > We are reaching a state where this diff can go in. I just startet > a regress run with it. OK bluhm@ hopefully i fixed the pfctl error messages up so the regress tests arent too unhappy.
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 03:21:29PM +0100, Alexander Bluhm wrote: > Hi, > > Some personal thoughts. I am happy when pf route-to gets simpler. > Especially I have never understood what this address@interface > syntax is used for. even after staring at it for so long, i still don't get it. i do think it was a reimplementation of an ipfilter thing, but i don't think it makes a lot of sense in ipfilter either. > I cannot estimate what configuration is used by our cutomers in > many installations. Simple syntax change address@interface -> > address of next hob should be no problem. Slight semantic changes > have to be dealt with. Current packet flow is complicated and may > be inspired by old NAT behavior. As long it becomes more sane and > easier to understand, we should change it. route-to $destination, not $next_hop... the biggest change we have to agree on at the moment is whether we're changing the semantic of "pf runs when a packet goes over an interface" to "pf runs when a packet goes in or out of the stack". this affects whether pf_test runs again when route-to changes the interface. > But I don't like artificial restrictions. We don't know all use > cases. reply-to and route-to could be used for both in and out > rules. I have used them for strange divert-to on bridge setups. > It should stay that way. i don't think it's complicated to support route-to and reply-to on both in and out rules. we've already found that there's use cases for reply-to on inbound rules, doing things on bridges just adds to that. it could be used on tpmr(4) too... > It would be nice to keep state-less route-to. I have found a special > case with that in the code of our product. But it looks like dead > code, so I would not object to remove state-less route-to for now. ok. thank you. > bluhm
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 02:30:46PM +0100, Alexandr Nedvedicky wrote: > Hello, > > ... > > > > i dont understand the kif lifetimes though. can we just point a > > pdesc at an arbitrary kif or do we need ot reference count? > > > > as long as we don't release NET_LOCK() (or PF_LOCK() in near future), > the reference to kif should remain valid. > > kif is very own PF's abstraction of network interfaces. there are > two ways how one can create a kif: > > there is either rule loaded, which refers to interface, which > is not plumbed yet. > > the new interface gets plumbed and PF must have a kif for it. > > the reference count currently used makes sure we destroy kif when > either interface is gone or when the rule, which refers kif is > gone. > > hope I remember details above right. ok. sounds simple enough... > > > I think this might be way to go. > > > > > > My only concern now is that such change is subtle. I mean the > > > > > > pass out ... route-to > > > > > > will change behavior in sense that current code will dispatch > > > packet to new interface and run pf_test() again. Once your diff > > > will be in the same rule will be accepted, but will bring entirely > > > different behaviour: just dispatch packet to new interface. > > > > yeah. > > > > the counter example is that i was extremely surprised when i > > discovered that pf_test gets run again when the outgoing interface > > is changed with a route-to rule. > > surprised because you've forgot about current model? which is: > run pf_test() whenever packet crosses the interface? forgot isn't the right word. i was in the room when henning and mcbride reworked pf and came up with the stack and wire side key terminology and semantics, and i've spent a lot of time looking at the ip input and output code where pf_test is called close to the stack. it just wasn't obvious to me that pf filtered over an interface rather than filtered in and out of the stack. apart from route-to in pf, i'm not sure it is a meaningful difference either. > > there's subtlety either way, we're just figuring out which one we're > > going for. > > yes exactly. there are trade offs either way. > > > > > > I think this is acceptable. If this will cause a friction we can > > > always > > > adjust the code in follow up commit to allow state-less > > > route-to/reply-to > > > with no support from pfsync(4). > > > > if we're going to support route-to on match rules i think this will be > > easy to implement. > > > > I think there must be some broader consent on model change > from current which is run pf_test() for every NIC crossing > to new way, which is run pf_test() at most two times. agreed. > > > > > > lastly, the "argument" or address specified with route-to (and > > > > > > reply-to and dup-to) is a destination address, not a next-hop. this > > > > > > has been discussed on the lists a couple of times before, so i won't > > > > > > go over it again, except to reiterate that it allows pf to force > > > > > > "sticky" path selection while opening up the possibility for ecmp > > > > > > and failover for where that path traverses. > > > > > > > > > > I keep forgetting about it as I still stick to current > > > > > interpretation. > > > > > > > > > > > > > > > I've seen changes to pfctl. Diff below still allows rule: > > > > > > > > > > pass in on net0 from 192.168.1.0/24 to any route-to > > > > > 10.10.10.10@em0 > > > > > > > > Is there use case for the @interface syntax apart from the current > > > > route-to rules? If not, we can just delete it. > > > > > > perhaps I'm still not quite on the same page as you then. I also > > > had no time to entirely test you diff. The way I understand your > > > effort is to change route-to behavior such it will be using > > > a destination instead of next-hop@interface. Or are you planning > > > to keep current form ('route-to next-hop@interface') working? > > > > if we ignore route-to, what's the use case for the interface part of > > address@interface? it doesnt seem to be accepted as part of an address > > in other parts of the grammar: > > > > dlg@kbuild ~$ echo pass in from 192.168.0.0@vmx0 | sudo pfctl -nf - > > stdin:1: @if syntax not permitted in from or to > > stdin:1: skipping rule due to errors > > stdin:1: rule expands to no valid combination > > dlg@kbuild ~$ echo pass from 192.168.0.0@vmx0 | sudo pfctl -nf - > > stdin:1: @if syntax not permitted in from or to > > stdin:1: skipping rule due to errors > > stdin:1: rule expands to no valid combination > > dlg@kbuild ~$ echo pass nat-to 192.168.0.0@vmx0 | sudo pfctl -nf - > > stdin:1: @if not permitted > > stdin:1: nat-to and rdr-to require a direction > > stdin:1: skipping rule due to errors > > stdin:1: rule expands to no valid combination > > dlg@kbuild ~$ echo pass nat-to 192.168.0.0@vmx0 | sudo pfc
Re: [External] : Re: pf route-to issues
Hello, pf_route() might leak a refence to ifp. > Index: sys/net/pf.c > === > RCS file: /cvs/src/sys/net/pf.c,v > retrieving revision 1.1101 > diff -u -p -r1.1101 pf.c > --- sys/net/pf.c 19 Jan 2021 22:22:23 - 1.1101 > +++ sys/net/pf.c 22 Jan 2021 07:33:31 - > @@ -5998,48 +5994,40 @@ pf_route(struct pf_pdesc *pd, struct pf_ > > ip = mtod(m0, struct ip *); > > + > + ifp = if_get(rt->rt_ifidx); > if (ifp == NULL) > goto bad; here we get a reference to ifp. > > - if (pd->kif->pfik_ifp != ifp) { > + /* A locally generated packet may have invalid source address. */ > + if ((ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET && > + (ifp->if_flags & IFF_LOOPBACK) == 0) > + ip->ip_src = ifatoia(rt->rt_ifa)->ia_addr.sin_addr; > + > + if (pd->dir == PF_IN) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > goto bad; > else if (m0 == NULL) > @@ -6052,16 +6040,6 @@ pf_route(struct pf_pdesc *pd, struct pf_ > ip = mtod(m0, struct ip *); > } > > - rt = rtalloc(sintosa(dst), RT_RESOLVE, rtableid); > - if (!rtisvalid(rt)) { > - ipstat_inc(ips_noroute); > - goto bad; > - } > - /* A locally generated packet may have invalid source address. */ > - if ((ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET && > - (ifp->if_flags & IFF_LOOPBACK) == 0) > - ip->ip_src = ifatoia(rt->rt_ifa)->ia_addr.sin_addr; > - > in_proto_cksum_out(m0, ifp); > > if (ntohs(ip->ip_len) <= ifp->if_mtu) { > @@ -6082,9 +6060,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ >*/ > if (ip->ip_off & htons(IP_DF)) { > ipstat_inc(ips_cantfrag); > - if (r->rt != PF_DUPTO) > + if (s->rt != PF_DUPTO) > pf_send_icmp(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, > - ifp->if_mtu, pd->af, r, pd->rdomain); > + ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain); > goto bad; here we do 'goto bad', which does not call if_put(). > } > > @@ -6108,8 +6086,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ > ipstat_inc(ips_fragmented); > > done: > - if (r->rt != PF_DUPTO) > + if (s->rt != PF_DUPTO) > pd->m = NULL; > + if_put(ifp); > rtfree(rt); > return; > pf_route6() suffers from the same issue. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, On Mon, Jan 25, 2021 at 03:21:29PM +0100, Alexander Bluhm wrote: > Hi, > > Some personal thoughts. I am happy when pf route-to gets simpler. > Especially I have never understood what this address@interface > syntax is used for. > > I cannot estimate what configuration is used by our cutomers in > many installations. Simple syntax change address@interface -> > address of next hob should be no problem. Slight semantic changes > have to be dealt with. Current packet flow is complicated and may > be inspired by old NAT behavior. As long it becomes more sane and > easier to understand, we should change it. I'm not sure if proposed scenario real. Let's assume there is a PF box with three NICs running on this awkward set up em1 ... 192.168.1.10 em0 em2 ... 192.168.1.10 em0 is attached to LAN em1 and em2 are facing to internet which is reachable with two different physical lines. both lines are connected via equipment, which uses fixed IP address 192.168.1.10 and PF admin has no way to change that. the 'address@interface' syntax is the only way to define rules: pass in on em0 from 172.16.0.0/16 route-to 192.168.1.10@em1 pass in on em0 from 172.17.0.0/16 route-to 192.168.1.10@em2 regardless of how much real such scenario is I believe it can currently work. > > But I don't like artificial restrictions. We don't know all use > cases. reply-to and route-to could be used for both in and out > rules. I have used them for strange divert-to on bridge setups. > It should stay that way. > OK I agree. regards sashan
Re: [External] : Re: pf route-to issues
hello, > > > +pf_route(struct pf_pdesc *pd, struct pf_state *s) > ... > > + if (pd->dir == PF_IN) { > > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > Yes, this is the correct logic. When the packet comes in, pf > overrides forwarding, tests the out rules, and sends it. For > outgoing packets on out route-to rules we have already tested the > rules. It also works for reply-to the other way around. the change above changes the behavior from 'do pf_test() when packet crosses interface' to 'call pf_test() at most two times'. the change essentially breaks the behavior I've mentioned earlier. table { 10.10.10.10, 172.16.1.1 } pass out on em0 from 192.168.1.0/24 to any route-to pass out on em1 from 192.168.1.0 to any nat-to (em1) pass out on em2 all the story goes as follows: destination/next-hop 10.10.10.10 is reached via em1 destination/next-hop 172.16.1.1 is reached via em2 for arbitrary reason we have to do NAT on em1 interface. with current PF things do work. trading current 'if (...)' for 'if (pd->dir == PF_IN)' breaks it. The NAT won't happen. I think this can be solved by using 'match ... route-to ...' rule suggested by David. However... I actually start to wonder does it hurt us to keep current behavior? Why do we want to change it? I'm getting worried this particular tid-bit will bring more harm than good. It also does not help much pfsync(4), if I understand things right. > > But what about dup-to? The packet is duplicated for both directions. > I guess the main use case for dup-to is implementing a monitor port. > There you have to pass packets stateless, otherwise it would not > work anyway. The strange semantics is not related to this diff. this is a good point. thanks and regards sahan
Re: pf route-to issues
On Fri, Jan 22, 2021 at 06:07:59PM +1000, David Gwynne wrote: > --- sys/conf/GENERIC 30 Sep 2020 14:51:17 - 1.273 > +++ sys/conf/GENERIC 22 Jan 2021 07:33:30 - > @@ -82,6 +82,7 @@ pseudo-device msts1 # MSTS line discipl > pseudo-deviceendrun 1 # EndRun line discipline > pseudo-devicevnd 4 # vnode disk devices > pseudo-deviceksyms 1 # kernel symbols device > +pseudo-devicekstat > #pseudo-device dt # Dynamic Tracer > > # clonable devices This is an unrelated chunk. > +pf_route(struct pf_pdesc *pd, struct pf_state *s) ... > + if (pd->dir == PF_IN) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) Yes, this is the correct logic. When the packet comes in, pf overrides forwarding, tests the out rules, and sends it. For outgoing packets on out route-to rules we have already tested the rules. It also works for reply-to the other way around. But what about dup-to? The packet is duplicated for both directions. I guess the main use case for dup-to is implementing a monitor port. There you have to pass packets stateless, otherwise it would not work anyway. The strange semantics is not related to this diff. We are reaching a state where this diff can go in. I just startet a regress run with it. OK bluhm@
Re: [External] : Re: pf route-to issues
Hi, Some personal thoughts. I am happy when pf route-to gets simpler. Especially I have never understood what this address@interface syntax is used for. I cannot estimate what configuration is used by our cutomers in many installations. Simple syntax change address@interface -> address of next hob should be no problem. Slight semantic changes have to be dealt with. Current packet flow is complicated and may be inspired by old NAT behavior. As long it becomes more sane and easier to understand, we should change it. But I don't like artificial restrictions. We don't know all use cases. reply-to and route-to could be used for both in and out rules. I have used them for strange divert-to on bridge setups. It should stay that way. It would be nice to keep state-less route-to. I have found a special case with that in the code of our product. But it looks like dead code, so I would not object to remove state-less route-to for now. bluhm
Re: [External] : Re: pf route-to issues
Hello, > > > > If I understand the idea right, then basically 'match out on em0' > > figures out the new 'outbound interface' so either 'pass out on em1...' > > or > > 'pass out on em2...' will kick in. In other words: > > > > depending on the destination picked up from table, > > the route-to action will override the em0 interface to > > either em1 or em2. > > yes. > > i dont understand the kif lifetimes though. can we just point a > pdesc at an arbitrary kif or do we need ot reference count? > as long as we don't release NET_LOCK() (or PF_LOCK() in near future), the reference to kif should remain valid. kif is very own PF's abstraction of network interfaces. there are two ways how one can create a kif: there is either rule loaded, which refers to interface, which is not plumbed yet. the new interface gets plumbed and PF must have a kif for it. the reference count currently used makes sure we destroy kif when either interface is gone or when the rule, which refers kif is gone. hope I remember details above right. > > I think this might be way to go. > > > > My only concern now is that such change is subtle. I mean the > > > > pass out ... route-to > > > > will change behavior in sense that current code will dispatch > > packet to new interface and run pf_test() again. Once your diff > > will be in the same rule will be accepted, but will bring entirely > > different behaviour: just dispatch packet to new interface. > > yeah. > > the counter example is that i was extremely surprised when i > discovered that pf_test gets run again when the outgoing interface > is changed with a route-to rule. surprised because you've forgot about current model? which is: run pf_test() whenever packet crosses the interface? > > there's subtlety either way, we're just figuring out which one we're > going for. yes exactly. there are trade offs either way. > > I think this is acceptable. If this will cause a friction we can always > > adjust the code in follow up commit to allow state-less > > route-to/reply-to > > with no support from pfsync(4). > > if we're going to support route-to on match rules i think this will be > easy to implement. > I think there must be some broader consent on model change from current which is run pf_test() for every NIC crossing to new way, which is run pf_test() at most two times. > > > > > > > > > > > > > lastly, the "argument" or address specified with route-to (and > > > > > reply-to and dup-to) is a destination address, not a next-hop. this > > > > > has been discussed on the lists a couple of times before, so i won't > > > > > go over it again, except to reiterate that it allows pf to force > > > > > "sticky" path selection while opening up the possibility for ecmp > > > > > and failover for where that path traverses. > > > > > > > > I keep forgetting about it as I still stick to current > > > > interpretation. > > > > > > > > > > > > I've seen changes to pfctl. Diff below still allows rule: > > > > > > > > pass in on net0 from 192.168.1.0/24 to any route-to 10.10.10.10@em0 > > > > > > Is there use case for the @interface syntax apart from the current > > > route-to rules? If not, we can just delete it. > > > > perhaps I'm still not quite on the same page as you then. I also > > had no time to entirely test you diff. The way I understand your > > effort is to change route-to behavior such it will be using > > a destination instead of next-hop@interface. Or are you planning > > to keep current form ('route-to next-hop@interface') working? > > if we ignore route-to, what's the use case for the interface part of > address@interface? it doesnt seem to be accepted as part of an address > in other parts of the grammar: > > dlg@kbuild ~$ echo pass in from 192.168.0.0@vmx0 | sudo pfctl -nf - > stdin:1: @if syntax not permitted in from or to > stdin:1: skipping rule due to errors > stdin:1: rule expands to no valid combination > dlg@kbuild ~$ echo pass from 192.168.0.0@vmx0 | sudo pfctl -nf - > stdin:1: @if syntax not permitted in from or to > stdin:1: skipping rule due to errors > stdin:1: rule expands to no valid combination > dlg@kbuild ~$ echo pass nat-to 192.168.0.0@vmx0 | sudo pfctl -nf - > stdin:1: @if not permitted > stdin:1: nat-to and rdr-to require a direction > stdin:1: skipping rule due to errors > stdin:1: rule expands to no valid combination > dlg@kbuild ~$ echo pass nat-to 192.168.0.0@vmx0 | sudo pfctl -nf - > > if route-to isn't going to use it, can we cut the @interface part out of > fthe address parser? > that's also my understanding. the form 192.168.0.0@net0 is currently used by route-to et.al. only. So we can eventually let it go. however there is a strong objection from Sven, which wants it to keep it around. > > unless we don't
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 01:11:35PM +0100, Alexandr Nedvedicky wrote: > Hello, > > > > > > > > I understand that simple is better here, so I won't object > > > if we will lean towards simplified model above. However I still > > > would like to share my view on current PF. > > > > > > the way I understand how things (should) work currently is fairly > > > simple: > > > > > > we always run pf_test() as packet crosses interface. > > > packet can cross interface either in outbound or > > > inbound direction. > > > > That's how I understand the current code. I'm proposing that we change > > the semantics so they are: > > > > - we always run pf_test as a packet enters or leaves the network stack. > > - pf is able to filter or apply policy based on various attributes > > of the packet such as addresses and ports, but also metadata about > > the packet such as the current prio, or the interface it came > > from or is going to. > > - changing a packet or it's metadata does not cause a rerun of pf_test. > > - route-to on an incoming packet basically bypasses the default > > stack processing with a "fast route" out of the stack. > > > > > this way we can always create a complex route-to loops, > > > however it can also solve some route-to vs. NAT issues. > > > consider those fairly innocent rules: > > > > > > 8<---8<---8<--8< > > > table { 10.10.10.10, 172.16.1.1 } > > > > > > pass out on em0 from 192.168.1.0/24 to any route-to > > > pass out on em1 from 192.168.1.0 to any nat-to (em1) > > > pass out on em2 all > > > 8<---8<---8<--8< > > > > > > Rules above should currently work, but will stop if we will > > > go with simplified model. > > > > The entries in make the packet go out em1 and em2? > > yes they do. let's say 10.10.10.10 is reached over em1, 172.16.1.1 is > reached over em2. sorry I have not specified that in my earlier email. npz. > > > > > I'm ok with breaking configs like that. We don't run pf_test again for > > other changes to the packet, so if we do want to support something like > > that I think we should make the following work: > > > > # pf_pdesc kif is em0 > > match out on em0 from 192.168.1.0/24 to any route-to > > # pf_pdesc kif is now em1 > > pass out on em1 from 192.168.1.0 to any nat-to (em1) > > pass out on em2 all > > > > This is more in line with how NAT rules operate. > > If I understand the idea right, then basically 'match out on em0' > figures out the new 'outbound interface' so either 'pass out on em1...' or > 'pass out on em2...' will kick in. In other words: > > depending on the destination picked up from table, > the route-to action will override the em0 interface to > either em1 or em2. yes. i dont understand the kif lifetimes though. can we just point a pdesc at an arbitrary kif or do we need ot reference count? > I think this might be way to go. > > My only concern now is that such change is subtle. I mean the > > pass out ... route-to > > will change behavior in sense that current code will dispatch > packet to new interface and run pf_test() again. Once your diff > will be in the same rule will be accepted, but will bring entirely > different behaviour: just dispatch packet to new interface. yeah. the counter example is that i was extremely surprised when i discovered that pf_test gets run again when the outgoing interface is changed with a route-to rule. there's subtlety either way, we're just figuring out which one we're going for. > > > I'll be OK with your simplified model if it will make things > > > more explicit: > > > > > > route-to option should be applied on inbound rules > > > only > > > > This would restrict how we currently write rules. See below about how we > > would be using it. > > > > > reply-to option should be applied on outbound rule > > > only > > > > I'm using reply-to on inbound rules. On these boxes I have a service > > (it's a dns resolver running unbound) that is accessible only via > > gre(4) tunnels, and I need the replies to those connections to go > > out the same interface they came in on. I'm running an older version of > > my diff, so I can have rules like this to make it work: > > > > pass in quick on gre0 reply-to gre0:peer > > pass in quick on gre1 reply-to gre1:peer > > > > The DNS traffic isn't going through this box, the replies that > > unbound is generating match the state created by the inbound rule. > > > > If I'm remembering correctly, sthen@ had a similar use case. > > you are right, I did not think much of a local bound traffic. > in this case reply-to needs to be kept as-is. > > > > > > dup-to option can go either way (in/out) > > > > Yep. > > > > > does it make sense? IMO yes, because doing route-to > > >
Re: [External] : Re: pf route-to issues
Hello, > > > > I understand that simple is better here, so I won't object > > if we will lean towards simplified model above. However I still > > would like to share my view on current PF. > > > > the way I understand how things (should) work currently is fairly > > simple: > > > > we always run pf_test() as packet crosses interface. > > packet can cross interface either in outbound or > > inbound direction. > > That's how I understand the current code. I'm proposing that we change > the semantics so they are: > > - we always run pf_test as a packet enters or leaves the network stack. > - pf is able to filter or apply policy based on various attributes > of the packet such as addresses and ports, but also metadata about > the packet such as the current prio, or the interface it came > from or is going to. > - changing a packet or it's metadata does not cause a rerun of pf_test. > - route-to on an incoming packet basically bypasses the default > stack processing with a "fast route" out of the stack. > > > this way we can always create a complex route-to loops, > > however it can also solve some route-to vs. NAT issues. > > consider those fairly innocent rules: > > > > 8<---8<---8<--8< > > table { 10.10.10.10, 172.16.1.1 } > > > > pass out on em0 from 192.168.1.0/24 to any route-to > > pass out on em1 from 192.168.1.0 to any nat-to (em1) > > pass out on em2 all > > 8<---8<---8<--8< > > > > Rules above should currently work, but will stop if we will > > go with simplified model. > > The entries in make the packet go out em1 and em2? yes they do. let's say 10.10.10.10 is reached over em1, 172.16.1.1 is reached over em2. sorry I have not specified that in my earlier email. > > I'm ok with breaking configs like that. We don't run pf_test again for > other changes to the packet, so if we do want to support something like > that I think we should make the following work: > > # pf_pdesc kif is em0 > match out on em0 from 192.168.1.0/24 to any route-to > # pf_pdesc kif is now em1 > pass out on em1 from 192.168.1.0 to any nat-to (em1) > pass out on em2 all > > This is more in line with how NAT rules operate. If I understand the idea right, then basically 'match out on em0' figures out the new 'outbound interface' so either 'pass out on em1...' or 'pass out on em2...' will kick in. In other words: depending on the destination picked up from table, the route-to action will override the em0 interface to either em1 or em2. I think this might be way to go. My only concern now is that such change is subtle. I mean the pass out ... route-to will change behavior in sense that current code will dispatch packet to new interface and run pf_test() again. Once your diff will be in the same rule will be accepted, but will bring entirely different behaviour: just dispatch packet to new interface. > > > I'll be OK with your simplified model if it will make things > > more explicit: > > > > route-to option should be applied on inbound rules > > only > > This would restrict how we currently write rules. See below about how we > would be using it. > > > reply-to option should be applied on outbound rule > > only > > I'm using reply-to on inbound rules. On these boxes I have a service > (it's a dns resolver running unbound) that is accessible only via > gre(4) tunnels, and I need the replies to those connections to go > out the same interface they came in on. I'm running an older version of > my diff, so I can have rules like this to make it work: > > pass in quick on gre0 reply-to gre0:peer > pass in quick on gre1 reply-to gre1:peer > > The DNS traffic isn't going through this box, the replies that > unbound is generating match the state created by the inbound rule. > > If I'm remembering correctly, sthen@ had a similar use case. you are right, I did not think much of a local bound traffic. in this case reply-to needs to be kept as-is. > > > dup-to option can go either way (in/out) > > Yep. > > > does it make sense? IMO yes, because doing route-to > > on outbound path feels unnatural to me. > > I agree that it feels a bit unnatural, but so far all the route-to rules > I've been writing have been on pass out rules. That could be peculiar to > my setup, but we generally allow packets in on our external links, and > apply policy on the outbound interface heading towards the relevant > service. eg: > > block > pass in on $if_external > pass out on $if_webservers proto tcp to port { http https } > pass out on $if_relays proto { tcp udp } to port domain > > We'd be sprinkling route-to on these pass out rules to tie connections > to specific backends. > > > > > > > > > > > > >
Re: [External] : Re: pf route-to issues
On Sun, Jan 24, 2021 at 11:51 PM David Gwynne wrote: > On Mon, Jan 25, 2021 at 02:50:12AM +0100, Alexandr Nedvedicky wrote: > > Hello, > > > > > > > > ok. i don't know how to split up the rest of the change though. > > > > > > here's an updated diff that includes the rest of the kernel changes and > > > the pfctl and pf.conf tweaks. > > > > > > it's probably useful for me to try and explain at a high level what > > > i think the semantics should be, otherwise we might end up arguing > about > > > which bits of the current config i broke. > > > > > > so, from an extremely high level point of view, and apologies if > > > this is condescending, pf sits between the network stack and an > > > interface that a packet travels on. for connections handled by the > > > local box, this means packets come from the stack and get an output > > > interface selected by a route lookup, then pf checks it, and then > > > it goes out the selected interface. replies come into an interface, > > > get checked by pf, and then enter the stack. when forwarding, a > > > packet comes into an interface, pf checks it, the stack does a route > > > lookup to pick an interface, pf checks it again, and then it goes > > > out the interface. > > > > > > so what does it mean when route-to (or reply-to) gets involved? i'm > > > saying that when route-to is applied to a packet, pf takes the packet > > > away from the stack and immediately forwards it toward to specified > > > destination address. for a packet entering the system, ie, when the > > > packet is going from the interface into the stack, route-to should > > > pretend that it is forwarding the packet and basically push it > > > straight out an interface. however, like normal forwarding via the > > > stack, there might be some policy on packets leaving that interface > that > > > you want to apply, so pf should run pf_test in that situation so the > > > policy can be applied. this is especially useful if you need to apply > > > nat-to when packets leave a particular interface. > > > > > > however, if you route-to when a packet is on the way out of the > > > stack, i'm arguing that pf should not run again against that packet. > > > currently route-to rules run pf_test again if the interface the packet > > > is routed out of changes, which means pf runs multiple times against a > > > packet if rules keep changing which interface it goes out. this means > > > there's loop prevention in pf to mitigate against this, and weird > > > potentials for multiple states to be created when nat gets involved. > > > > > > for simplicity, both in terms of reasoning and code i think pf should > > > only be run once when a packet enters the system, and only once when it > > > leaves the system. the only reason i can come up with for running > > > pf_test multiple times when route-to changes the outgoing interface is > > > so you can check the packet with "pass out on $new_if" type rules. we > > > don't rerun pf again when nat/rdr changes addresses, so this feels > > > inconsistent to me. > > > > I understand that simple is better here, so I won't object > > if we will lean towards simplified model above. However I still > > would like to share my view on current PF. > > > > the way I understand how things (should) work currently is fairly > simple: > > > > we always run pf_test() as packet crosses interface. > > packet can cross interface either in outbound or > > inbound direction. > > That's how I understand the current code. I'm proposing that we change > the semantics so they are: > > - we always run pf_test as a packet enters or leaves the network stack. > - pf is able to filter or apply policy based on various attributes > of the packet such as addresses and ports, but also metadata about > the packet such as the current prio, or the interface it came > from or is going to. > - changing a packet or it's metadata does not cause a rerun of pf_test. > - route-to on an incoming packet basically bypasses the default > stack processing with a "fast route" out of the stack. > > > this way we can always create a complex route-to loops, > > however it can also solve some route-to vs. NAT issues. > > consider those fairly innocent rules: > > > > 8<---8<---8<--8< > > table { 10.10.10.10, 172.16.1.1 } > > > > pass out on em0 from 192.168.1.0/24 to any route-to > > pass out on em1 from 192.168.1.0 to any nat-to (em1) > > pass out on em2 all > > 8<---8<---8<--8< > > > > Rules above should currently work, but will stop if we will > > go with simplified model. > > The entries in make the packet go out em1 and em2? > > I'm ok with breaking configs like that. We don't run pf_test again for > other changes to the packet, so if we do want to support something like > that I think we should make the following work: > > # pf_pdesc kif is em0 >
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 02:50:12AM +0100, Alexandr Nedvedicky wrote: > Hello, > > > > > ok. i don't know how to split up the rest of the change though. > > > > here's an updated diff that includes the rest of the kernel changes and > > the pfctl and pf.conf tweaks. > > > > it's probably useful for me to try and explain at a high level what > > i think the semantics should be, otherwise we might end up arguing about > > which bits of the current config i broke. > > > > so, from an extremely high level point of view, and apologies if > > this is condescending, pf sits between the network stack and an > > interface that a packet travels on. for connections handled by the > > local box, this means packets come from the stack and get an output > > interface selected by a route lookup, then pf checks it, and then > > it goes out the selected interface. replies come into an interface, > > get checked by pf, and then enter the stack. when forwarding, a > > packet comes into an interface, pf checks it, the stack does a route > > lookup to pick an interface, pf checks it again, and then it goes > > out the interface. > > > > so what does it mean when route-to (or reply-to) gets involved? i'm > > saying that when route-to is applied to a packet, pf takes the packet > > away from the stack and immediately forwards it toward to specified > > destination address. for a packet entering the system, ie, when the > > packet is going from the interface into the stack, route-to should > > pretend that it is forwarding the packet and basically push it > > straight out an interface. however, like normal forwarding via the > > stack, there might be some policy on packets leaving that interface that > > you want to apply, so pf should run pf_test in that situation so the > > policy can be applied. this is especially useful if you need to apply > > nat-to when packets leave a particular interface. > > > > however, if you route-to when a packet is on the way out of the > > stack, i'm arguing that pf should not run again against that packet. > > currently route-to rules run pf_test again if the interface the packet > > is routed out of changes, which means pf runs multiple times against a > > packet if rules keep changing which interface it goes out. this means > > there's loop prevention in pf to mitigate against this, and weird > > potentials for multiple states to be created when nat gets involved. > > > > for simplicity, both in terms of reasoning and code i think pf should > > only be run once when a packet enters the system, and only once when it > > leaves the system. the only reason i can come up with for running > > pf_test multiple times when route-to changes the outgoing interface is > > so you can check the packet with "pass out on $new_if" type rules. we > > don't rerun pf again when nat/rdr changes addresses, so this feels > > inconsistent to me. > > I understand that simple is better here, so I won't object > if we will lean towards simplified model above. However I still > would like to share my view on current PF. > > the way I understand how things (should) work currently is fairly simple: > > we always run pf_test() as packet crosses interface. > packet can cross interface either in outbound or > inbound direction. That's how I understand the current code. I'm proposing that we change the semantics so they are: - we always run pf_test as a packet enters or leaves the network stack. - pf is able to filter or apply policy based on various attributes of the packet such as addresses and ports, but also metadata about the packet such as the current prio, or the interface it came from or is going to. - changing a packet or it's metadata does not cause a rerun of pf_test. - route-to on an incoming packet basically bypasses the default stack processing with a "fast route" out of the stack. > this way we can always create a complex route-to loops, > however it can also solve some route-to vs. NAT issues. > consider those fairly innocent rules: > > 8<---8<---8<--8< > table { 10.10.10.10, 172.16.1.1 } > > pass out on em0 from 192.168.1.0/24 to any route-to > pass out on em1 from 192.168.1.0 to any nat-to (em1) > pass out on em2 all > 8<---8<---8<--8< > > Rules above should currently work, but will stop if we will > go with simplified model. The entries in make the packet go out em1 and em2? I'm ok with breaking configs like that. We don't run pf_test again for other changes to the packet, so if we do want to support something like that I think we should make the following work: # pf_pdesc kif is em0 match out on em0 from 192.168.1.0/24 to any route-to # pf_pdesc kif is now em1 pass out on em1 from 192.168.1.0 to any nat-to (em1) pass out on em2 all This is more in line with how NAT rules operate. > I'll be OK with
Re: [External] : Re: pf route-to issues
Hello, > > ok. i don't know how to split up the rest of the change though. > > here's an updated diff that includes the rest of the kernel changes and > the pfctl and pf.conf tweaks. > > it's probably useful for me to try and explain at a high level what > i think the semantics should be, otherwise we might end up arguing about > which bits of the current config i broke. > > so, from an extremely high level point of view, and apologies if > this is condescending, pf sits between the network stack and an > interface that a packet travels on. for connections handled by the > local box, this means packets come from the stack and get an output > interface selected by a route lookup, then pf checks it, and then > it goes out the selected interface. replies come into an interface, > get checked by pf, and then enter the stack. when forwarding, a > packet comes into an interface, pf checks it, the stack does a route > lookup to pick an interface, pf checks it again, and then it goes > out the interface. > > so what does it mean when route-to (or reply-to) gets involved? i'm > saying that when route-to is applied to a packet, pf takes the packet > away from the stack and immediately forwards it toward to specified > destination address. for a packet entering the system, ie, when the > packet is going from the interface into the stack, route-to should > pretend that it is forwarding the packet and basically push it > straight out an interface. however, like normal forwarding via the > stack, there might be some policy on packets leaving that interface that > you want to apply, so pf should run pf_test in that situation so the > policy can be applied. this is especially useful if you need to apply > nat-to when packets leave a particular interface. > > however, if you route-to when a packet is on the way out of the > stack, i'm arguing that pf should not run again against that packet. > currently route-to rules run pf_test again if the interface the packet > is routed out of changes, which means pf runs multiple times against a > packet if rules keep changing which interface it goes out. this means > there's loop prevention in pf to mitigate against this, and weird > potentials for multiple states to be created when nat gets involved. > > for simplicity, both in terms of reasoning and code i think pf should > only be run once when a packet enters the system, and only once when it > leaves the system. the only reason i can come up with for running > pf_test multiple times when route-to changes the outgoing interface is > so you can check the packet with "pass out on $new_if" type rules. we > don't rerun pf again when nat/rdr changes addresses, so this feels > inconsistent to me. I understand that simple is better here, so I won't object if we will lean towards simplified model above. However I still would like to share my view on current PF. the way I understand how things (should) work currently is fairly simple: we always run pf_test() as packet crosses interface. packet can cross interface either in outbound or inbound direction. this way we can always create a complex route-to loops, however it can also solve some route-to vs. NAT issues. consider those fairly innocent rules: 8<---8<---8<--8< table { 10.10.10.10, 172.16.1.1 } pass out on em0 from 192.168.1.0/24 to any route-to pass out on em1 from 192.168.1.0 to any nat-to (em1) pass out on em2 all 8<---8<---8<--8< Rules above should currently work, but will stop if we will go with simplified model. I'll be OK with your simplified model if it will make things more explicit: route-to option should be applied on inbound rules only reply-to option should be applied on outbound rule only dup-to option can go either way (in/out) does it make sense? IMO yes, because doing route-to on outbound path feels unnatural to me. > > this also breaks the ability to do route-to without states. is there a > reason to do that apart from the DSR type things? did we agree that > those use cases could be handled by sloppy states instead? If I remember correct we need to make 'keep state' mandatory for route-to so it can work well with pfsync(4), right? > > lastly, the "argument" or address specified with route-to (and > reply-to and dup-to) is a destination address, not a next-hop. this > has been discussed on the lists a couple of times before, so i won't > go over it again, except to reiterate that it allows pf to force > "sticky" path selection while opening up the possibility for ecmp > and failover for where that path traverses. I keep forgetting about it as I still stick to current interpretation. I've seen changes to pfctl. Diff below still allows rule: pass in on net0 from 192.168.1.0/24 to any route
Re: pf route-to issues
On Fri, Jan 08, 2021 at 04:43:39PM +0100, Alexandr Nedvedicky wrote: > Hello, > > > > > > revision 1.294 > > date: 2003/01/02 01:56:56; author: dhartmei; state: Exp; lines: +27 -49; > > When route-to/reply-to is used in combination with address translation, > > pf_test() may be called twice for the same packet. In this case, make > > sure the translation is only applied in the second call. This solves > > the problem with state insert failures where the second pf_test() call > > tried to insert another state entry after the first call's translation. > > ok henning@, mcbride@, thanks to Joe Nall for additional testing. > > > > > > I have tested your diffs in my setup, they all pass. I have not > > tested the scenario mentioned in the commit message. Note that the > > address translation implementation in 2003 was different from what > > we have now. And sasha@'s analysis shows that the current code is > > wrong in other use cases. > > > > I've completely forgot there was a change in NAT. Therefore I could > not understand the commit message. > > > > > > > The only way to find out is to commit it. It reduces comlexity that > > noone understands. > > > > OK bluhm@ to remove the check > > > > Please leave the "if (pd->kif->pfik_ifp != ifp)" around pf_test() > > in pf_route() as it is for now. > > I agree with bluhm@ here. we should proceed with small steps in such > case and let things to settle down before making next move. ok. i don't know how to split up the rest of the change though. here's an updated diff that includes the rest of the kernel changes and the pfctl and pf.conf tweaks. it's probably useful for me to try and explain at a high level what i think the semantics should be, otherwise we might end up arguing about which bits of the current config i broke. so, from an extremely high level point of view, and apologies if this is condescending, pf sits between the network stack and an interface that a packet travels on. for connections handled by the local box, this means packets come from the stack and get an output interface selected by a route lookup, then pf checks it, and then it goes out the selected interface. replies come into an interface, get checked by pf, and then enter the stack. when forwarding, a packet comes into an interface, pf checks it, the stack does a route lookup to pick an interface, pf checks it again, and then it goes out the interface. so what does it mean when route-to (or reply-to) gets involved? i'm saying that when route-to is applied to a packet, pf takes the packet away from the stack and immediately forwards it toward to specified destination address. for a packet entering the system, ie, when the packet is going from the interface into the stack, route-to should pretend that it is forwarding the packet and basically push it straight out an interface. however, like normal forwarding via the stack, there might be some policy on packets leaving that interface that you want to apply, so pf should run pf_test in that situation so the policy can be applied. this is especially useful if you need to apply nat-to when packets leave a particular interface. however, if you route-to when a packet is on the way out of the stack, i'm arguing that pf should not run again against that packet. currently route-to rules run pf_test again if the interface the packet is routed out of changes, which means pf runs multiple times against a packet if rules keep changing which interface it goes out. this means there's loop prevention in pf to mitigate against this, and weird potentials for multiple states to be created when nat gets involved. for simplicity, both in terms of reasoning and code i think pf should only be run once when a packet enters the system, and only once when it leaves the system. the only reason i can come up with for running pf_test multiple times when route-to changes the outgoing interface is so you can check the packet with "pass out on $new_if" type rules. we don't rerun pf again when nat/rdr changes addresses, so this feels inconsistent to me. i also don't think route-to is used much. getting basic functionality working is surprisingly hard, so the complicated possibilities in the current code are almost certainly not taken advantage of. we're going to break existing configurations anyway, so if we can agree that pf only runs twice even if route-to gets involved, then i'm not going to feel bad about breaking something this complicated anyway. this also breaks the ability to do route-to without states. is there a reason to do that apart from the DSR type things? did we agree that those use cases could be handled by sloppy states instead? lastly, the "argument" or address specified with route-to (and reply-to and dup-to) is a destination address, not a next-hop. this has been discussed on the lists a couple of times before, so i won't go over it again, except to reit
Re: pf route-to issues
Hello, > > revision 1.294 > date: 2003/01/02 01:56:56; author: dhartmei; state: Exp; lines: +27 -49; > When route-to/reply-to is used in combination with address translation, > pf_test() may be called twice for the same packet. In this case, make > sure the translation is only applied in the second call. This solves > the problem with state insert failures where the second pf_test() call > tried to insert another state entry after the first call's translation. > ok henning@, mcbride@, thanks to Joe Nall for additional testing. > > > I have tested your diffs in my setup, they all pass. I have not > tested the scenario mentioned in the commit message. Note that the > address translation implementation in 2003 was different from what > we have now. And sasha@'s analysis shows that the current code is > wrong in other use cases. > I've completely forgot there was a change in NAT. Therefore I could not understand the commit message. > > The only way to find out is to commit it. It reduces comlexity that > noone understands. > > OK bluhm@ to remove the check > > Please leave the "if (pd->kif->pfik_ifp != ifp)" around pf_test() > in pf_route() as it is for now. I agree with bluhm@ here. we should proceed with small steps in such case and let things to settle down before making next move. thanks and regards sashan
Re: pf route-to issues
On Tue, Jan 05, 2021 at 10:05:39PM +1000, David Gwynne wrote: > If the idea is to avoid running most of pf_test again if route-to is > applied during ip_output, I think this tweaked diff is simpler. Is there > a valid use case for running some of pf_test again after route-to is > applied? I found the original commit that introduced this strange check. revision 1.294 date: 2003/01/02 01:56:56; author: dhartmei; state: Exp; lines: +27 -49; When route-to/reply-to is used in combination with address translation, pf_test() may be called twice for the same packet. In this case, make sure the translation is only applied in the second call. This solves the problem with state insert failures where the second pf_test() call tried to insert another state entry after the first call's translation. ok henning@, mcbride@, thanks to Joe Nall for additional testing. I have tested your diffs in my setup, they all pass. I have not tested the scenario mentioned in the commit message. Note that the address translation implementation in 2003 was different from what we have now. And sasha@'s analysis shows that the current code is wrong in other use cases. The check in pf_find_state() seems to be unrelated to the call to pf_test() in pf_route(). I have to rethink it separately. How can we figure out what happens when we remove the check? It may harm some cases and benefit others or make no sense at all. My regression test, which tests each feature individually, is not affected. The only way to find out is to commit it. It reduces comlexity that noone understands. OK bluhm@ to remove the check Please leave the "if (pd->kif->pfik_ifp != ifp)" around pf_test() in pf_route() as it is for now.
Re: pf route-to issues
Hello, sorry to come back after while... On Tue, Jan 05, 2021 at 10:05:39PM +1000, David Gwynne wrote: > On Mon, Jan 04, 2021 at 06:37:54PM +0100, Alexander Bluhm wrote: > > On Mon, Jan 04, 2021 at 11:21:50PM +1000, David Gwynne wrote: > > > this chunk pops out as a standalone change. > > > > > > having pf_find_state() return PF_PASS here means the callers short > > > circuit and let the packet go through without running it through the > > > a lot of the state handling, which includes things like protocol state > > > updates, nat, scrubbing, some pflog handling, and most importantly, > > > later calls to pf_route(). > > > > pf_route() calls pf_test() again with a different interface. > > > > The idea of this code is, that the interface which is passed to > > pf_test() from ip_output() is wrong. The call to pf_set_rt_ifp() > > changes it in the state. > > > > In the pf_test() call from ip_output() we skip the tests. We know > > they will happen in pf_test() called from pf_route(). Without this > > chunk we would do state handling twice with different interfaces. > > > > Is that analysis correct? > > I think so, but I didn't get as much time to poke at this today as I was > hoping. > > If the idea is to avoid running most of pf_test again if route-to is > applied during ip_output, I think this tweaked diff is simpler. Is there > a valid use case for running some of pf_test again after route-to is > applied? > > The pf_set_rt_ifp() stuff could be cleaned up if we can get away with > this. I think this should go in. I was trying to test this change, but I'm unable to create set up which would work on google cloud in the way I need. I suspect failing to convince underlying vswitch to carry packets between test hosts for me. as soon as I start to do something more fancy, packets start to disappear in underlying fabric... If I understand the change right we really need to take care of route-to action for inbound packet 'to convince PF' stack the packet actually enters firewall on interface desired by route-to action ordered by rule found by currently executed pf_test(). for outbound case the IP stack will be executing the pf_test(). so yes, I'm OK with this change. regards sashan
Re: pf route-to issues
On Mon, Jan 04, 2021 at 06:37:54PM +0100, Alexander Bluhm wrote: > On Mon, Jan 04, 2021 at 11:21:50PM +1000, David Gwynne wrote: > > this chunk pops out as a standalone change. > > > > having pf_find_state() return PF_PASS here means the callers short > > circuit and let the packet go through without running it through the > > a lot of the state handling, which includes things like protocol state > > updates, nat, scrubbing, some pflog handling, and most importantly, > > later calls to pf_route(). > > pf_route() calls pf_test() again with a different interface. > > The idea of this code is, that the interface which is passed to > pf_test() from ip_output() is wrong. The call to pf_set_rt_ifp() > changes it in the state. > > In the pf_test() call from ip_output() we skip the tests. We know > they will happen in pf_test() called from pf_route(). Without this > chunk we would do state handling twice with different interfaces. > > Is that analysis correct? I think so, but I didn't get as much time to poke at this today as I was hoping. If the idea is to avoid running most of pf_test again if route-to is applied during ip_output, I think this tweaked diff is simpler. Is there a valid use case for running some of pf_test again after route-to is applied? The pf_set_rt_ifp() stuff could be cleaned up if we can get away with this. Index: pf.c === RCS file: /cvs/src/sys/net/pf.c,v retrieving revision 1.1097 diff -u -p -r1.1097 pf.c --- pf.c4 Jan 2021 12:48:27 - 1.1097 +++ pf.c5 Jan 2021 11:18:14 - @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc } *state = s; - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && - ((s->rule.ptr->rt == PF_ROUTETO && - s->rule.ptr->direction == PF_OUT) || - (s->rule.ptr->rt == PF_REPLYTO && - s->rule.ptr->direction == PF_IN))) - return (PF_PASS); return (PF_MATCH); } @@ -6049,7 +6043,7 @@ pf_route(struct pf_pdesc *pd, struct pf_ if (ifp == NULL) goto bad; - if (pd->kif->pfik_ifp != ifp) { + if (pd->dir == PF_IN) { if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) goto bad; else if (m0 == NULL) @@ -6204,7 +6198,7 @@ pf_route6(struct pf_pdesc *pd, struct pf if (ifp == NULL) goto bad; - if (pd->kif->pfik_ifp != ifp) { + if (pd->dir == PF_IN) { if (pf_test(AF_INET6, PF_OUT, ifp, &m0) != PF_PASS) goto bad; else if (m0 == NULL)
Re: pf route-to issues
Hello, On Mon, Jan 04, 2021 at 06:37:54PM +0100, Alexander Bluhm wrote: > On Mon, Jan 04, 2021 at 11:21:50PM +1000, David Gwynne wrote: > > this chunk pops out as a standalone change. > > > > having pf_find_state() return PF_PASS here means the callers short > > circuit and let the packet go through without running it through the > > a lot of the state handling, which includes things like protocol state > > updates, nat, scrubbing, some pflog handling, and most importantly, > > later calls to pf_route(). > > pf_route() calls pf_test() again with a different interface. yes, that's correct. In ideal world one of those should happen: a) no state is found, try to find matching rule b) matching rule found, create a new state if the rule orders to do so c) no rule found either, just accept packet > > The idea of this code is, that the interface which is passed to > pf_test() from ip_output() is wrong. The call to pf_set_rt_ifp() > changes it in the state. can you clarify what 'wrong' means here? > > Index: pf.c > > === > > RCS file: /cvs/src/sys/net/pf.c,v > > retrieving revision 1.1097 > > diff -u -p -r1.1097 pf.c > > --- pf.c4 Jan 2021 12:48:27 - 1.1097 > > +++ pf.c4 Jan 2021 13:08:26 - > > @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc > > } > > > > *state = s; > > - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && verify interface matches the one specified in state for outbound packet. if the interfaces don't match, then > > - ((s->rule.ptr->rt == PF_ROUTETO && > > - s->rule.ptr->direction == PF_OUT) || we must be either dealing with state created by route-to action bound to outbound rule. > > - (s->rule.ptr->rt == PF_REPLYTO && > > - s->rule.ptr->direction == PF_IN))) > > - return (PF_PASS); or reply-to action bound to inbound rule... ...if it is the case then we short circuit the state check and assume the state matches... let's consider the rules as follows: pass out from 1.2.3.4 to any route-to 10.20.30.40@em1 pass out from any to 1.2.3.4 route-to 192.168.1.10@em0 pass out on em1 from 1.2.3.4 to any nat-to (em1) let's further assume there is outbound packet on em0 interface. the packet matches the rule with 'route-to' action. state gets created: out@em0 1.2.3.4 -> a.b.c.d route-to 10.20.30.40@em1 pf_route() gets called, and packet arrives to pf_test() again as outbound on em1 interface. No state should be found, because em1 != em0. so that complex if does not kick in. Now let there be a response packet at em1 interface. It gets translated to a.b.c.d -> 1.2.3.4 and forwarded. let's further assume the packet hits (default) route, which goes via em3. There is outbound packet at em3 inspected by pf_test(): out@em3 a.b.c.d -> 1.2.3.4 no matching state is found because of em3 != em0, so packet hits the 'pass out from any to 1.2.3.4' rule. The rule creates state: out@em3 a.b.c.d -> 1.2.3.4 route-to 192.168.1.10@em0 and packet is sent out via pf_route('em0'). The packet enters pf_test() again. We find matching state this time: in@em0 1.2.3.4 -> a.b.c.d route-to@10.20.30.40@em1 let's see if the complex condition is met this time: pd->dir == PF_OUT (it is outbound packet) pd->kif == em0 (bound to em0 interface) s->rt_kif == em1(route* action is bound to em1) s->rule.ptr->direction == PF_OUT (state got created by outbound rule) s->rule.ptr == PF_ROUTETO (there is a route-to action at rule) It looks like the complex if() test is satisfied this time, so we bail out with PF_PASS. So what are consequences of skipping proper state check? first of all the state does not get updated. If the packet, which bypasses state check is SYN-ACK, the state is left in SYN-SENT state, although the connection initiator is in connected state already (assuming it receives SYN-ACK). I hope I have not missed anything in story above. it seems to me the code in question can indeed do some harm (sometimes...) thanks and regards sashan
Re: pf route-to issues
On Mon, Jan 04, 2021 at 11:21:50PM +1000, David Gwynne wrote: > this chunk pops out as a standalone change. > > having pf_find_state() return PF_PASS here means the callers short > circuit and let the packet go through without running it through the > a lot of the state handling, which includes things like protocol state > updates, nat, scrubbing, some pflog handling, and most importantly, > later calls to pf_route(). pf_route() calls pf_test() again with a different interface. The idea of this code is, that the interface which is passed to pf_test() from ip_output() is wrong. The call to pf_set_rt_ifp() changes it in the state. In the pf_test() call from ip_output() we skip the tests. We know they will happen in pf_test() called from pf_route(). Without this chunk we would do state handling twice with different interfaces. Is that analysis correct? bluhm > Index: pf.c > === > RCS file: /cvs/src/sys/net/pf.c,v > retrieving revision 1.1097 > diff -u -p -r1.1097 pf.c > --- pf.c 4 Jan 2021 12:48:27 - 1.1097 > +++ pf.c 4 Jan 2021 13:08:26 - > @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc > } > > *state = s; > - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && > - ((s->rule.ptr->rt == PF_ROUTETO && > - s->rule.ptr->direction == PF_OUT) || > - (s->rule.ptr->rt == PF_REPLYTO && > - s->rule.ptr->direction == PF_IN))) > - return (PF_PASS); > > return (PF_MATCH); > }
Re: pf route-to issues
Hello, > My diff removes the kif here ... > > > > > - if (rv == 0) { > > > > - s->rt_kif = r->route.kif; > > > > + if (rv == 0) > > > > s->natrule.ptr = r; > > > > - } > > ... and the {}. > > Anyway, it should not be commited without the userland part. > (and not without compling it :-) yes you are right. there was a conflict and I ignored the resulting pf.c.rej. entirely my fault. your diff looks good and should go in. The userland part will be tricky. perhaps we should fork existing `host()` function and tailor it to route-to et. al. The current `host()` as-is still tries to resolve name to interface if dns fails. We don't want it now. your diff reads OK to me. thanks and regards sashan > > bluhm
Re: pf route-to issues
On Mon, Jan 04, 2021 at 04:32:45PM +0100, Alexandr Nedvedicky wrote: > so either rt_kif must stay for a while, or your new diff (rebased on top > of > stuff committed already) must be expanded by the nit pick I've sent. The diff I sent contains this bit. I still think the merge bug is on your side. > to put it clear: I'm concerned the diff posted here: > https://marc.info/?l=openbsd-tech&m=160976516119388&w=2 > is not complete and should not be committed as is. It compiles, I recreated the diff and attached it. > > > - s->rt_kif = NULL; > > > if (!r->rt) > > > return (0); My diff removes the kif here ... > > > - if (rv == 0) { > > > - s->rt_kif = r->route.kif; > > > + if (rv == 0) > > > s->natrule.ptr = r; > > > - } ... and the {}. Anyway, it should not be commited without the userland part. (and not without compling it :-) bluhm Index: net/if_pfsync.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_pfsync.c,v retrieving revision 1.280 diff -u -p -r1.280 if_pfsync.c --- net/if_pfsync.c 4 Jan 2021 12:48:27 - 1.280 +++ net/if_pfsync.c 4 Jan 2021 15:57:17 - @@ -613,6 +613,7 @@ pfsync_state_import(struct pfsync_state /* copy to state */ st->rt_addr = sp->rt_addr; + st->rt = sp->rt; st->creation = getuptime() - ntohl(sp->creation); st->expire = getuptime(); if (ntohl(sp->expire)) { @@ -643,7 +644,6 @@ pfsync_state_import(struct pfsync_state st->rule.ptr = r; st->anchor.ptr = NULL; - st->rt_kif = NULL; st->pfsync_time = getuptime(); st->sync_state = PFSYNC_S_NONE; @@ -1857,7 +1857,7 @@ pfsync_undefer(struct pfsync_deferral *p if (drop) m_freem(pd->pd_m); else { - if (st->rule.ptr->rt == PF_ROUTETO) { + if (st->rt == PF_ROUTETO) { if (pf_setup_pdesc(&pdesc, st->key[PF_SK_WIRE]->af, st->direction, st->kif, pd->pd_m, NULL) != PF_PASS) { @@ -1866,11 +1866,11 @@ pfsync_undefer(struct pfsync_deferral *p } switch (st->key[PF_SK_WIRE]->af) { case AF_INET: - pf_route(&pdesc, st->rule.ptr, st); + pf_route(&pdesc, st); break; #ifdef INET6 case AF_INET6: - pf_route6(&pdesc, st->rule.ptr, st); + pf_route6(&pdesc, st); break; #endif /* INET6 */ default: Index: net/pf.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/pf.c,v retrieving revision 1.1097 diff -u -p -r1.1097 pf.c --- net/pf.c4 Jan 2021 12:48:27 - 1.1097 +++ net/pf.c4 Jan 2021 15:57:17 - @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc } *state = s; - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && - ((s->rule.ptr->rt == PF_ROUTETO && - s->rule.ptr->direction == PF_OUT) || - (s->rule.ptr->rt == PF_REPLYTO && - s->rule.ptr->direction == PF_IN))) - return (PF_PASS); return (PF_MATCH); } @@ -1186,6 +1180,7 @@ pf_state_export(struct pfsync_state *sp, /* copy from state */ strlcpy(sp->ifname, st->kif->pfik_name, sizeof(sp->ifname)); + sp->rt = st->rt; sp->rt_addr = st->rt_addr; sp->creation = htonl(getuptime() - st->creation); expire = pf_state_expires(st); @@ -3433,16 +3428,13 @@ pf_set_rt_ifp(struct pf_state *s, struct struct pf_rule *r = s->rule.ptr; int rv; - s->rt_kif = NULL; - if (!r->rt) + if (r->rt == PF_NOPFROUTE) return (0); rv = pf_map_addr(af, r, saddr, &s->rt_addr, NULL, sns, &r->route, PF_SN_ROUTE); - if (rv == 0) { - s->rt_kif = r->route.kif; - s->natrule.ptr = r; - } + if (rv == 0) + s->rt = r->rt; return (rv); } @@ -5973,15 +5965,13 @@ pf_rtlabel_match(struct pf_addr *addr, s /* pf_route() may change pd->m, adjust local copies after calling */ void -pf_route(struct pf_pdesc *pd, struct pf_rule *r, struct pf_state *s) +pf_route(struct pf_pdesc *pd, struct pf_state *s) { struct mbuf *m0, *m1; struct sockaddr_in *dst, sin; struct rtentry *rt = NULL; struct ip *ip; struct ifnet*ifp = NULL; - struct pf_addr naddr; - struct pf_src_node *sns[PF_SN_MAX]; int error = 0; un
Re: pf route-to issues
Hello, I'm sorry I was not clear enough in my earlier email. On Mon, Jan 04, 2021 at 03:56:45PM +0100, Alexander Bluhm wrote: > On Mon, Jan 04, 2021 at 03:26:15PM +0100, Alexandr Nedvedicky wrote: > > you refactoring diff requires a minor finishing touch to keep the > > stuff compiling: > > Did I commit something that does not compile? I just made cvs > update on another machine. There it worked. all diffs you commit compile, there are no doubts in this. I was talking about change sent here: https://marc.info/?l=openbsd-tech&m=160976516119388&w=2 diff above contains chunk here, which removes rt_kif. 8<---8<---8<---8<---8<8< Index: net/pfvar.h === RCS file: /cvs/src/sys/net/pfvar.h,v retrieving revision 1.497 diff -u -p -r1.497 pfvar.h --- net/pfvar.h 14 Oct 2020 19:22:14 - 1.497 +++ net/pfvar.h 4 Jan 2021 12:52:02 - @@ -762,7 +762,6 @@ struct pf_state { struct pf_sn_head src_nodes; struct pf_state_key *key[2]; /* addresses stack and wire */ struct pfi_kif *kif; - struct pfi_kif *rt_kif; u_int64_t packets[2]; u_int64_t bytes[2]; int32_tcreation; 8<---8<---8<---8<---8<8< > > The rt_kif in pf_state still exists. The diff below should not > be necessary. Maybe you forgot to clean pfvar.h. > so either rt_kif must stay for a while, or your new diff (rebased on top of stuff committed already) must be expanded by the nit pick I've sent. to put it clear: I'm concerned the diff posted here: https://marc.info/?l=openbsd-tech&m=160976516119388&w=2 is not complete and should not be committed as is. thanks and regards sashan > > > 8<---8<---8<---8<---8<8< > > diff --git a/sys/net/pf.c b/sys/net/pf.c > > index b8766df1686..3f9f5b13add 100644 > > --- a/sys/net/pf.c > > +++ b/sys/net/pf.c > > @@ -3428,16 +3428,13 @@ pf_set_rt_ifp(struct pf_state *s, struct pf_addr > > *saddr, sa_family_t af, > > struct pf_rule *r = s->rule.ptr; > > int rv; > > > > - s->rt_kif = NULL; > > if (!r->rt) > > return (0); > > > > rv = pf_map_addr(af, r, saddr, &s->rt_addr, NULL, sns, > > &r->route, PF_SN_ROUTE); > > - if (rv == 0) { > > - s->rt_kif = r->route.kif; > > + if (rv == 0) > > s->natrule.ptr = r; > > - } > > > > return (rv); > > } > > 8<---8<---8<---8<---8<8< > > > > > > thanks and > > regards > > sashan >
Re: pf route-to issues
On Mon, Jan 04, 2021 at 03:26:15PM +0100, Alexandr Nedvedicky wrote: > you refactoring diff requires a minor finishing touch to keep the > stuff compiling: Did I commit something that does not compile? I just made cvs update on another machine. There it worked. The rt_kif in pf_state still exists. The diff below should not be necessary. Maybe you forgot to clean pfvar.h. bluhm > 8<---8<---8<---8<---8<8< > diff --git a/sys/net/pf.c b/sys/net/pf.c > index b8766df1686..3f9f5b13add 100644 > --- a/sys/net/pf.c > +++ b/sys/net/pf.c > @@ -3428,16 +3428,13 @@ pf_set_rt_ifp(struct pf_state *s, struct pf_addr > *saddr, sa_family_t af, > struct pf_rule *r = s->rule.ptr; > int rv; > > - s->rt_kif = NULL; > if (!r->rt) > return (0); > > rv = pf_map_addr(af, r, saddr, &s->rt_addr, NULL, sns, > &r->route, PF_SN_ROUTE); > - if (rv == 0) { > - s->rt_kif = r->route.kif; > + if (rv == 0) > s->natrule.ptr = r; > - } > > return (rv); > } > 8<---8<---8<---8<---8<8< > > > thanks and > regards > sashan
Re: pf route-to issues
Hello, On Mon, Jan 04, 2021 at 01:57:24PM +0100, Alexander Bluhm wrote: > On Mon, Jan 04, 2021 at 11:46:16AM +0100, Alexandr Nedvedicky wrote: > > > let's put this in and then i'll have a look. ok by me. > > bluhm's diff is fine with me. > > Refactoring is commited, here is the remaining kernel diff after merge. > > bluhm > > Index: net/pf.c > === > RCS file: /cvs/src/sys/net/pf.c,v > retrieving revision 1.1097 > diff -u -p -r1.1097 pf.c > --- net/pf.c 4 Jan 2021 12:48:27 - 1.1097 > +++ net/pf.c 4 Jan 2021 12:52:02 - > @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc > } > > *state = s; > - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && > - ((s->rule.ptr->rt == PF_ROUTETO && > - s->rule.ptr->direction == PF_OUT) || > - (s->rule.ptr->rt == PF_REPLYTO && > - s->rule.ptr->direction == PF_IN))) > - return (PF_PASS); > > return (PF_MATCH); > } I've sent OK to dlg for the chunk above. I think it's worth to have this fix in separate changeset. you refactoring diff requires a minor finishing touch to keep the stuff compiling: 8<---8<---8<---8<---8<8< diff --git a/sys/net/pf.c b/sys/net/pf.c index b8766df1686..3f9f5b13add 100644 --- a/sys/net/pf.c +++ b/sys/net/pf.c @@ -3428,16 +3428,13 @@ pf_set_rt_ifp(struct pf_state *s, struct pf_addr *saddr, sa_family_t af, struct pf_rule *r = s->rule.ptr; int rv; - s->rt_kif = NULL; if (!r->rt) return (0); rv = pf_map_addr(af, r, saddr, &s->rt_addr, NULL, sns, &r->route, PF_SN_ROUTE); - if (rv == 0) { - s->rt_kif = r->route.kif; + if (rv == 0) s->natrule.ptr = r; - } return (rv); } 8<---8<---8<---8<---8<8< thanks and regards sashan
Re: pf route-to issues
Hello, On Mon, Jan 04, 2021 at 11:21:50PM +1000, David Gwynne wrote: > On Mon, Jan 04, 2021 at 01:57:24PM +0100, Alexander Bluhm wrote: > > On Mon, Jan 04, 2021 at 11:46:16AM +0100, Alexandr Nedvedicky wrote: > > > > let's put this in and then i'll have a look. ok by me. > > > bluhm's diff is fine with me. > > > > Refactoring is commited, here is the remaining kernel diff after merge. > > this chunk pops out as a standalone change. > > having pf_find_state() return PF_PASS here means the callers short > circuit and let the packet go through without running it through the > a lot of the state handling, which includes things like protocol state > updates, nat, scrubbing, some pflog handling, and most importantly, > later calls to pf_route(). > > ok? I think this should go in. I've seen it in bluhm's larger diff, which I still need to finish. I'm fine if change will be committed as it solves a real bug. OK sashan > > Index: pf.c > === > RCS file: /cvs/src/sys/net/pf.c,v > retrieving revision 1.1097 > diff -u -p -r1.1097 pf.c > --- pf.c 4 Jan 2021 12:48:27 - 1.1097 > +++ pf.c 4 Jan 2021 13:08:26 - > @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc > } > > *state = s; > - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && > - ((s->rule.ptr->rt == PF_ROUTETO && > - s->rule.ptr->direction == PF_OUT) || > - (s->rule.ptr->rt == PF_REPLYTO && > - s->rule.ptr->direction == PF_IN))) > - return (PF_PASS); > > return (PF_MATCH); > }
Re: pf route-to issues
On Mon, Jan 04, 2021 at 01:57:24PM +0100, Alexander Bluhm wrote: > On Mon, Jan 04, 2021 at 11:46:16AM +0100, Alexandr Nedvedicky wrote: > > > let's put this in and then i'll have a look. ok by me. > > bluhm's diff is fine with me. > > Refactoring is commited, here is the remaining kernel diff after merge. this chunk pops out as a standalone change. having pf_find_state() return PF_PASS here means the callers short circuit and let the packet go through without running it through the a lot of the state handling, which includes things like protocol state updates, nat, scrubbing, some pflog handling, and most importantly, later calls to pf_route(). ok? Index: pf.c === RCS file: /cvs/src/sys/net/pf.c,v retrieving revision 1.1097 diff -u -p -r1.1097 pf.c --- pf.c4 Jan 2021 12:48:27 - 1.1097 +++ pf.c4 Jan 2021 13:08:26 - @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc } *state = s; - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && - ((s->rule.ptr->rt == PF_ROUTETO && - s->rule.ptr->direction == PF_OUT) || - (s->rule.ptr->rt == PF_REPLYTO && - s->rule.ptr->direction == PF_IN))) - return (PF_PASS); return (PF_MATCH); }
Re: pf route-to issues
On Mon, Jan 04, 2021 at 11:46:16AM +0100, Alexandr Nedvedicky wrote: > > let's put this in and then i'll have a look. ok by me. > bluhm's diff is fine with me. Refactoring is commited, here is the remaining kernel diff after merge. bluhm Index: net/if_pfsync.c === RCS file: /cvs/src/sys/net/if_pfsync.c,v retrieving revision 1.280 diff -u -p -r1.280 if_pfsync.c --- net/if_pfsync.c 4 Jan 2021 12:48:27 - 1.280 +++ net/if_pfsync.c 4 Jan 2021 12:52:01 - @@ -613,6 +613,7 @@ pfsync_state_import(struct pfsync_state /* copy to state */ st->rt_addr = sp->rt_addr; + st->rt = sp->rt; st->creation = getuptime() - ntohl(sp->creation); st->expire = getuptime(); if (ntohl(sp->expire)) { @@ -643,7 +644,6 @@ pfsync_state_import(struct pfsync_state st->rule.ptr = r; st->anchor.ptr = NULL; - st->rt_kif = NULL; st->pfsync_time = getuptime(); st->sync_state = PFSYNC_S_NONE; @@ -1857,7 +1857,7 @@ pfsync_undefer(struct pfsync_deferral *p if (drop) m_freem(pd->pd_m); else { - if (st->rule.ptr->rt == PF_ROUTETO) { + if (st->rt == PF_ROUTETO) { if (pf_setup_pdesc(&pdesc, st->key[PF_SK_WIRE]->af, st->direction, st->kif, pd->pd_m, NULL) != PF_PASS) { @@ -1866,11 +1866,11 @@ pfsync_undefer(struct pfsync_deferral *p } switch (st->key[PF_SK_WIRE]->af) { case AF_INET: - pf_route(&pdesc, st->rule.ptr, st); + pf_route(&pdesc, st); break; #ifdef INET6 case AF_INET6: - pf_route6(&pdesc, st->rule.ptr, st); + pf_route6(&pdesc, st); break; #endif /* INET6 */ default: Index: net/pf.c === RCS file: /cvs/src/sys/net/pf.c,v retrieving revision 1.1097 diff -u -p -r1.1097 pf.c --- net/pf.c4 Jan 2021 12:48:27 - 1.1097 +++ net/pf.c4 Jan 2021 12:52:02 - @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc } *state = s; - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && - ((s->rule.ptr->rt == PF_ROUTETO && - s->rule.ptr->direction == PF_OUT) || - (s->rule.ptr->rt == PF_REPLYTO && - s->rule.ptr->direction == PF_IN))) - return (PF_PASS); return (PF_MATCH); } @@ -1186,6 +1180,7 @@ pf_state_export(struct pfsync_state *sp, /* copy from state */ strlcpy(sp->ifname, st->kif->pfik_name, sizeof(sp->ifname)); + sp->rt = st->rt; sp->rt_addr = st->rt_addr; sp->creation = htonl(getuptime() - st->creation); expire = pf_state_expires(st); @@ -3433,16 +3428,13 @@ pf_set_rt_ifp(struct pf_state *s, struct struct pf_rule *r = s->rule.ptr; int rv; - s->rt_kif = NULL; - if (!r->rt) + if (r->rt == PF_NOPFROUTE) return (0); rv = pf_map_addr(af, r, saddr, &s->rt_addr, NULL, sns, &r->route, PF_SN_ROUTE); - if (rv == 0) { - s->rt_kif = r->route.kif; - s->natrule.ptr = r; - } + if (rv == 0) + s->rt = r->rt; return (rv); } @@ -5973,15 +5965,13 @@ pf_rtlabel_match(struct pf_addr *addr, s /* pf_route() may change pd->m, adjust local copies after calling */ void -pf_route(struct pf_pdesc *pd, struct pf_rule *r, struct pf_state *s) +pf_route(struct pf_pdesc *pd, struct pf_state *s) { struct mbuf *m0, *m1; struct sockaddr_in *dst, sin; struct rtentry *rt = NULL; struct ip *ip; struct ifnet*ifp = NULL; - struct pf_addr naddr; - struct pf_src_node *sns[PF_SN_MAX]; int error = 0; unsigned int rtableid; @@ -5991,11 +5981,11 @@ pf_route(struct pf_pdesc *pd, struct pf_ return; } - if (r->rt == PF_DUPTO) { + if (s->rt == PF_DUPTO) { if ((m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT)) == NULL) return; } else { - if ((r->rt == PF_REPLYTO) == (r->direction == pd->dir)) + if ((s->rt == PF_REPLYTO) == (s->direction == pd->dir)) return; m0 = pd->m; } @@ -6008,44 +5998,31 @@ pf_route(struct pf_pdesc *pd, struct pf_ ip = mtod(m0, struct ip *); - memset(&sin, 0, sizeof(sin)); - dst = &sin; - dst->sin_family = AF_INET; - ds
Re: pf route-to issues
> On 4 Jan 2021, at 9:27 pm, Alexandr Nedvedicky > wrote: > > Hello, > > there is one more thing, which just came up on my mind. > > >> >> so i want to change route-to in pfctl so it takes a nexthop instead >> of an interface. you could argue that pf already lets you do this, >> because there's some bs nexthop@interface syntax. my counter argument >> is that the interface the nexthop is reachable over is redundant, and it >> makes fixing some of the other problems harder if we keep it. >> > >what is your plan for dup-to then? if my understanding of dup-to is >correct, then it allows administrator to copy matching packets and send >them out dedicated interface so another physical box (box with running >snort) can intercept them and process them. > >I remember we had to do some assumptions about this, when porting PF to >Solaris. So Solaris interpretation of option > > 'dup-to net12' > >is to send out copy of matching packet via net12 interface. because there >is no next-hop specified, we just use link broadcast when pushing out the >packet to network. I agree this is a hack. If route-to will be changed >to accept next-hop instead of interface, then we will be able to kill >such hack. route-to, reply-to, and dup-to take an address as an argument. In dup-to's case, the packet is duplicated and then the copy is routed toward the destination. As discussed previously, the address argument is a destination, it's not a link local address per se. The destination address could be a directly connected or link local address, but it could also be via a gateway (including a those learnt by a dynamic routing protocol), and/or via multiple gateways. This allows for failover or ECMP like we get with packets where the path is selected by the rtable lookup. dlg > > > >> >> if we limit the information needed for pf_route to a nexthop address, >> and which direction the address is used, this is doable. both the >> pf_state and pfsync_state structs already contain an address to store a >> nexthop in, i just had to move the route-to direction from the rule into >> the state. this is easy with pf_state, but i used a spare pad field in >> pfsync_state for this. >> > >this should be fine, because route-to et.al. don't work with 'block' rules. > > > thanks and > regards > sashan
Re: pf route-to issues
Hello, there is one more thing, which just came up on my mind. > > so i want to change route-to in pfctl so it takes a nexthop instead > of an interface. you could argue that pf already lets you do this, > because there's some bs nexthop@interface syntax. my counter argument > is that the interface the nexthop is reachable over is redundant, and it > makes fixing some of the other problems harder if we keep it. > what is your plan for dup-to then? if my understanding of dup-to is correct, then it allows administrator to copy matching packets and send them out dedicated interface so another physical box (box with running snort) can intercept them and process them. I remember we had to do some assumptions about this, when porting PF to Solaris. So Solaris interpretation of option 'dup-to net12' is to send out copy of matching packet via net12 interface. because there is no next-hop specified, we just use link broadcast when pushing out the packet to network. I agree this is a hack. If route-to will be changed to accept next-hop instead of interface, then we will be able to kill such hack. > > if we limit the information needed for pf_route to a nexthop address, > and which direction the address is used, this is doable. both the > pf_state and pfsync_state structs already contain an address to store a > nexthop in, i just had to move the route-to direction from the rule into > the state. this is easy with pf_state, but i used a spare pad field in > pfsync_state for this. > this should be fine, because route-to et.al. don't work with 'block' rules. thanks and regards sashan
Re: pf route-to issues
Hello, > > the stack should provide it on a 4 byte boundary, but it has uint64_t > members. however, it is also __packed, so the compiler makes no > assumptions about alignment. > > > Is there anything else that can be split out easily? > > let's put this in and then i'll have a look. ok by me. > bluhm's diff is fine with me. thanks and regards sashan
Re: pf route-to issues
On Mon, Jan 04, 2021 at 12:58:17AM +0100, Alexander Bluhm wrote: > On Sun, Jan 03, 2021 at 06:56:20PM +0100, Alexander Bluhm wrote: > > I am currently running a full regress to find more fallout. > > These regress tests fail: > > sys/net/pf_forward > sys/net/pf_fragment > sbin/pfctl > > The first two are easy to fix. That means my tests using route-to > work fine with your diff. Just remove the @interface as below. pretty much, yes. > pfctl tests pfail8 and pf13 use very strange routespec syntax. You > might want to take a look at what that meant before and what should > be valid now. this is another syntax which we seem to agree is confusing. this makes me more convinced that it needs to be changed. pfail8.in and pf13.in should be modified to route-to an IP address instead of an interface. these regress tests are a bit confusing because they're just testing the parser and the addresses that they're using aren't configured anywhere. pfail8.ok shows that pfctl should generate some more specific error messages, which is easily fixed. > > bluhm > > Index: regress/sys/net/pf_forward/pf.conf > === > RCS file: /mount/openbsd/cvs/src/regress/sys/net/pf_forward/pf.conf,v > retrieving revision 1.5 > diff -u -p -r1.5 pf.conf > --- regress/sys/net/pf_forward/pf.conf11 Jan 2018 03:23:16 - > 1.5 > +++ regress/sys/net/pf_forward/pf.conf3 Jan 2021 23:26:54 - > @@ -17,22 +17,22 @@ pass out inet6 > pass in to $AF_IN6/64 af-to inet from $PF_OUT to $ECO_IN/24 tag af > pass out inettagged af > > -pass in to $RTT_IN/24 route-to $RT_IN@$PF_IFOUT tag rttin > -pass out tagged rttin > -pass in to $RTT_IN6/64 route-to $RT_IN6@$PF_IFOUT tag rttin > -pass out tagged rttin > +pass in to $RTT_IN/24 route-to $RT_IN tag rttin > +pass out tagged rttin > +pass in to $RTT_IN6/64 route-to $RT_IN6 tag rttin > +pass out tagged rttin > > -pass in to $RTT_OUT/24 tag rttout > -pass out route-to $RT_IN@$PF_IFOUT tagged rttout > -pass in to $RTT_OUT6/64tag rttout > -pass out route-to $RT_IN6@$PF_IFOUT tagged rttout > +pass in to $RTT_OUT/24 tag rttout > +pass out route-to $RT_IN tagged rttout > +pass in to $RTT_OUT6/64 tag rttout > +pass out route-to $RT_IN6 tagged rttout > > -pass in from $RPT_IN/24 reply-to $SRC_OUT@$PF_IFIN tag rptin > -pass out tagged rptin > -pass in from $RPT_IN6/64 reply-to $SRC_OUT6@$PF_IFIN tag rptin > -pass out tagged rptin > +pass in from $RPT_IN/24 reply-to $SRC_OUT tag rptin > +pass out tagged rptin > +pass in from $RPT_IN6/64 reply-to $SRC_OUT6 tag rptin > +pass out tagged rptin > > -pass in from $RPT_OUT/24 tag rptout > -pass out reply-to $SRC_OUT@$PF_IFIN tagged rptout > -pass in from $RPT_OUT6/64 tag rptout > -pass out reply-to $SRC_OUT6@$PF_IFIN tagged rptout > +pass in from $RPT_OUT/24 tag rptout > +pass out reply-to $SRC_OUT tagged rptout > +pass in from $RPT_OUT6/64tag rptout > +pass out reply-to $SRC_OUT6 tagged rptout > Index: regress/sys/net/pf_fragment/pf.conf > === > RCS file: /mount/openbsd/cvs/src/regress/sys/net/pf_fragment/pf.conf,v > retrieving revision 1.5 > diff -u -p -r1.5 pf.conf > --- regress/sys/net/pf_fragment/pf.conf 7 Jun 2017 20:09:07 - > 1.5 > +++ regress/sys/net/pf_fragment/pf.conf 3 Jan 2021 23:28:07 - > @@ -10,7 +10,7 @@ pass outnat-to $PF_OUT > pass in to $RDR_IN6/64 rdr-to $ECO_IN6 allow-opts tag rdr > pass outnat-to $PF_OUT6 allow-opts tagged rdr > > -pass in to $RTT_IN/24 allow-opts tag rtt > -pass outroute-to $RT_IN@$PF_IFOUT allow-opts tagged rtt > -pass in to $RTT_IN6/64allow-opts tag rtt > -pass outroute-to $RT_IN6@$PF_IFOUT allow-opts tagged rtt > +pass in to $RTT_IN/24 allow-opts tag rtt > +pass outroute-to $RT_IN allow-opts tagged rtt > +pass in to $RTT_IN6/64 allow-opts tag rtt > +pass outroute-to $RT_IN6 allow-opts tagged rtt >
Re: pf route-to issues
On Sun, Jan 03, 2021 at 06:56:20PM +0100, Alexander Bluhm wrote: > On Sun, Jan 03, 2021 at 02:00:00PM +1000, David Gwynne wrote: > > On Tue, Oct 20, 2020 at 09:27:09AM +1000, David Gwynne wrote: > > We've been running this diff in production for the last couple of > > months, and it's been solid for us so far. Ignoring the fixes for > > crashes, I personally find it a lot more usable than the current > > route-to rules too. > > > > Can I commit it? > > The diff is quite large and does multiple things at a time. In hindsight I agree. It was hard to see while being so close to it. > In general I also did not understand why I have to say em0@10.0.0.1 > for routing and it took me a while to figure out what to put into > pf.conf. I use this syntax in /usr/src/regress/sys/net/pf_forward/pf.conf. > This has to be fixed after this goes in. I will care about regress > as this test is quite complex an needs several machines to setup. > I am currently running a full regress to find more fallout. > > I do not use pfsync, so I cannot say what the consequences of the > change are in this area. Also I don't know why pf-route interfaces > were designed in such a strange way. we do use pfsync, and not being able to use it with route-to has been a point of friction for us for years. as for the design, i think it was copied (imperfectly) from ipfilter. look for "Policy Based Routing" in https://www.freebsd.org/cgi/man.cgi?query=ipf&sektion=5. > From a user perspective it is not clear, why route-to should not > work together with no-state. So we should either fix it or document > it and add a check in the parser. Is fixing hard? pf_route only takes a state now, so i'd say it is non-trivial. for now i'll go with documentation and a check in the parser.. > Are we losing any other features apart from this strange arp reuse > you described in your mail? i wouldn't say the arp reuse is a feature. > There is some refactoring in your diff. I splitted it to make > review easier. I think this should go in first. Note that the > pf_state variable is called st in if_pfsync.c. Can we be consistent > here? Is the pfsync_state properly aligned? During import it comes > from an mbuf. the stack should provide it on a 4 byte boundary, but it has uint64_t members. however, it is also __packed, so the compiler makes no assumptions about alignment. > Is there anything else that can be split out easily? let's put this in and then i'll have a look. ok by me. > > bluhm > > Index: net/if_pfsync.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_pfsync.c,v > retrieving revision 1.279 > diff -u -p -r1.279 if_pfsync.c > --- net/if_pfsync.c 12 Dec 2020 11:49:02 - 1.279 > +++ net/if_pfsync.c 3 Jan 2021 17:16:55 - > @@ -612,7 +612,7 @@ pfsync_state_import(struct pfsync_state > st->rtableid[PF_SK_STACK] = ntohl(sp->rtableid[PF_SK_STACK]); > > /* copy to state */ > - bcopy(&sp->rt_addr, &st->rt_addr, sizeof(st->rt_addr)); > + st->rt_addr = sp->rt_addr; > st->creation = getuptime() - ntohl(sp->creation); > st->expire = getuptime(); > if (ntohl(sp->expire)) { > @@ -1843,6 +1843,7 @@ pfsync_undefer(struct pfsync_deferral *p > { > struct pfsync_softc *sc = pfsyncif; > struct pf_pdesc pdesc; > + struct pf_state *st = pd->pd_st; > > NET_ASSERT_LOCKED(); > > @@ -1852,35 +1853,32 @@ pfsync_undefer(struct pfsync_deferral *p > TAILQ_REMOVE(&sc->sc_deferrals, pd, pd_entry); > sc->sc_deferred--; > > - CLR(pd->pd_st->state_flags, PFSTATE_ACK); > + CLR(st->state_flags, PFSTATE_ACK); > if (drop) > m_freem(pd->pd_m); > else { > - if (pd->pd_st->rule.ptr->rt == PF_ROUTETO) { > - if (pf_setup_pdesc(&pdesc, > - pd->pd_st->key[PF_SK_WIRE]->af, > - pd->pd_st->direction, pd->pd_st->rt_kif, > - pd->pd_m, NULL) != PF_PASS) { > + if (st->rule.ptr->rt == PF_ROUTETO) { > + if (pf_setup_pdesc(&pdesc, st->key[PF_SK_WIRE]->af, > + st->direction, st->kif, pd->pd_m, NULL) != > + PF_PASS) { > m_freem(pd->pd_m); > goto out; > } > - switch (pd->pd_st->key[PF_SK_WIRE]->af) { > + switch (st->key[PF_SK_WIRE]->af) { > case AF_INET: > - pf_route(&pdesc, > - pd->pd_st->rule.ptr, pd->pd_st); > + pf_route(&pdesc, st->rule.ptr, st); > break; > #ifdef INET6 > case AF_INET6: > - pf_route6(&pdesc, > - pd->pd_st->rule.ptr, pd->pd_st); > +
Re: pf route-to issues
On Sun, Jan 03, 2021 at 06:56:20PM +0100, Alexander Bluhm wrote: > I am currently running a full regress to find more fallout. These regress tests fail: sys/net/pf_forward sys/net/pf_fragment sbin/pfctl The first two are easy to fix. That means my tests using route-to work fine with your diff. Just remove the @interface as below. pfctl tests pfail8 and pf13 use very strange routespec syntax. You might want to take a look at what that meant before and what should be valid now. bluhm Index: regress/sys/net/pf_forward/pf.conf === RCS file: /mount/openbsd/cvs/src/regress/sys/net/pf_forward/pf.conf,v retrieving revision 1.5 diff -u -p -r1.5 pf.conf --- regress/sys/net/pf_forward/pf.conf 11 Jan 2018 03:23:16 - 1.5 +++ regress/sys/net/pf_forward/pf.conf 3 Jan 2021 23:26:54 - @@ -17,22 +17,22 @@ pass out inet6 pass in to $AF_IN6/64 af-to inet from $PF_OUT to $ECO_IN/24 tag af pass out inettagged af -pass in to $RTT_IN/24 route-to $RT_IN@$PF_IFOUT tag rttin -pass out tagged rttin -pass in to $RTT_IN6/64 route-to $RT_IN6@$PF_IFOUT tag rttin -pass out tagged rttin +pass in to $RTT_IN/24 route-to $RT_IN tag rttin +pass out tagged rttin +pass in to $RTT_IN6/64 route-to $RT_IN6 tag rttin +pass out tagged rttin -pass in to $RTT_OUT/24 tag rttout -pass out route-to $RT_IN@$PF_IFOUT tagged rttout -pass in to $RTT_OUT6/64tag rttout -pass out route-to $RT_IN6@$PF_IFOUT tagged rttout +pass in to $RTT_OUT/24 tag rttout +pass out route-to $RT_IN tagged rttout +pass in to $RTT_OUT6/64 tag rttout +pass out route-to $RT_IN6 tagged rttout -pass in from $RPT_IN/24 reply-to $SRC_OUT@$PF_IFIN tag rptin -pass out tagged rptin -pass in from $RPT_IN6/64 reply-to $SRC_OUT6@$PF_IFIN tag rptin -pass out tagged rptin +pass in from $RPT_IN/24 reply-to $SRC_OUT tag rptin +pass out tagged rptin +pass in from $RPT_IN6/64 reply-to $SRC_OUT6 tag rptin +pass out tagged rptin -pass in from $RPT_OUT/24 tag rptout -pass out reply-to $SRC_OUT@$PF_IFIN tagged rptout -pass in from $RPT_OUT6/64 tag rptout -pass out reply-to $SRC_OUT6@$PF_IFIN tagged rptout +pass in from $RPT_OUT/24 tag rptout +pass out reply-to $SRC_OUT tagged rptout +pass in from $RPT_OUT6/64tag rptout +pass out reply-to $SRC_OUT6 tagged rptout Index: regress/sys/net/pf_fragment/pf.conf === RCS file: /mount/openbsd/cvs/src/regress/sys/net/pf_fragment/pf.conf,v retrieving revision 1.5 diff -u -p -r1.5 pf.conf --- regress/sys/net/pf_fragment/pf.conf 7 Jun 2017 20:09:07 - 1.5 +++ regress/sys/net/pf_fragment/pf.conf 3 Jan 2021 23:28:07 - @@ -10,7 +10,7 @@ pass outnat-to $PF_OUT pass in to $RDR_IN6/64 rdr-to $ECO_IN6 allow-opts tag rdr pass outnat-to $PF_OUT6 allow-opts tagged rdr -pass in to $RTT_IN/24 allow-opts tag rtt -pass outroute-to $RT_IN@$PF_IFOUT allow-opts tagged rtt -pass in to $RTT_IN6/64allow-opts tag rtt -pass outroute-to $RT_IN6@$PF_IFOUT allow-opts tagged rtt +pass in to $RTT_IN/24 allow-opts tag rtt +pass outroute-to $RT_IN allow-opts tagged rtt +pass in to $RTT_IN6/64 allow-opts tag rtt +pass outroute-to $RT_IN6 allow-opts tagged rtt
Re: pf route-to issues
On Sun, Jan 03, 2021 at 02:00:00PM +1000, David Gwynne wrote: > On Tue, Oct 20, 2020 at 09:27:09AM +1000, David Gwynne wrote: > We've been running this diff in production for the last couple of > months, and it's been solid for us so far. Ignoring the fixes for > crashes, I personally find it a lot more usable than the current > route-to rules too. > > Can I commit it? The diff is quite large and does multiple things at a time. In general I also did not understand why I have to say em0@10.0.0.1 for routing and it took me a while to figure out what to put into pf.conf. I use this syntax in /usr/src/regress/sys/net/pf_forward/pf.conf. This has to be fixed after this goes in. I will care about regress as this test is quite complex an needs several machines to setup. I am currently running a full regress to find more fallout. I do not use pfsync, so I cannot say what the consequences of the change are in this area. Also I don't know why pf-route interfaces were designed in such a strange way. >From a user perspective it is not clear, why route-to should not work together with no-state. So we should either fix it or document it and add a check in the parser. Is fixing hard? Are we losing any other features apart from this strange arp reuse you described in your mail? There is some refactoring in your diff. I splitted it to make review easier. I think this should go in first. Note that the pf_state variable is called st in if_pfsync.c. Can we be consistent here? Is the pfsync_state properly aligned? During import it comes from an mbuf. Is there anything else that can be split out easily? bluhm Index: net/if_pfsync.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_pfsync.c,v retrieving revision 1.279 diff -u -p -r1.279 if_pfsync.c --- net/if_pfsync.c 12 Dec 2020 11:49:02 - 1.279 +++ net/if_pfsync.c 3 Jan 2021 17:16:55 - @@ -612,7 +612,7 @@ pfsync_state_import(struct pfsync_state st->rtableid[PF_SK_STACK] = ntohl(sp->rtableid[PF_SK_STACK]); /* copy to state */ - bcopy(&sp->rt_addr, &st->rt_addr, sizeof(st->rt_addr)); + st->rt_addr = sp->rt_addr; st->creation = getuptime() - ntohl(sp->creation); st->expire = getuptime(); if (ntohl(sp->expire)) { @@ -1843,6 +1843,7 @@ pfsync_undefer(struct pfsync_deferral *p { struct pfsync_softc *sc = pfsyncif; struct pf_pdesc pdesc; + struct pf_state *st = pd->pd_st; NET_ASSERT_LOCKED(); @@ -1852,35 +1853,32 @@ pfsync_undefer(struct pfsync_deferral *p TAILQ_REMOVE(&sc->sc_deferrals, pd, pd_entry); sc->sc_deferred--; - CLR(pd->pd_st->state_flags, PFSTATE_ACK); + CLR(st->state_flags, PFSTATE_ACK); if (drop) m_freem(pd->pd_m); else { - if (pd->pd_st->rule.ptr->rt == PF_ROUTETO) { - if (pf_setup_pdesc(&pdesc, - pd->pd_st->key[PF_SK_WIRE]->af, - pd->pd_st->direction, pd->pd_st->rt_kif, - pd->pd_m, NULL) != PF_PASS) { + if (st->rule.ptr->rt == PF_ROUTETO) { + if (pf_setup_pdesc(&pdesc, st->key[PF_SK_WIRE]->af, + st->direction, st->kif, pd->pd_m, NULL) != + PF_PASS) { m_freem(pd->pd_m); goto out; } - switch (pd->pd_st->key[PF_SK_WIRE]->af) { + switch (st->key[PF_SK_WIRE]->af) { case AF_INET: - pf_route(&pdesc, - pd->pd_st->rule.ptr, pd->pd_st); + pf_route(&pdesc, st->rule.ptr, st); break; #ifdef INET6 case AF_INET6: - pf_route6(&pdesc, - pd->pd_st->rule.ptr, pd->pd_st); + pf_route6(&pdesc, st->rule.ptr, st); break; #endif /* INET6 */ default: - unhandled_af(pd->pd_st->key[PF_SK_WIRE]->af); + unhandled_af(st->key[PF_SK_WIRE]->af); } pd->pd_m = pdesc.m; } else { - switch (pd->pd_st->key[PF_SK_WIRE]->af) { + switch (st->key[PF_SK_WIRE]->af) { case AF_INET: ip_output(pd->pd_m, NULL, NULL, 0, NULL, NULL, 0); @@ -1892,12 +1890,12 @@ pfsync_undefer(struct pfsync_deferral *p break; #endif /* INET6 */ default: - unhandled_af(pd->pd_st->key[PF_SK_WIRE
Re: pf route-to issues
On Tue, Oct 20, 2020 at 09:27:09AM +1000, David Gwynne wrote: > > i am feeling very warm and fuzzy about this diff at the moment. We've been running this diff in production for the last couple of months, and it's been solid for us so far. Ignoring the fixes for crashes, I personally find it a lot more usable than the current route-to rules too. Can I commit it? Index: sbin/pfctl/parse.y === RCS file: /cvs/src/sbin/pfctl/parse.y,v retrieving revision 1.707 diff -u -p -r1.707 parse.y --- sbin/pfctl/parse.y 16 Dec 2020 18:01:16 - 1.707 +++ sbin/pfctl/parse.y 3 Jan 2021 03:53:02 - @@ -276,6 +276,7 @@ struct filter_opts { struct redirspec nat; struct redirspec rdr; struct redirspec rroute; + u_int8_t rt; /* scrub opts */ int nodf; @@ -284,15 +285,6 @@ struct filter_opts { int randomid; int max_mss; - /* route opts */ - struct { - struct node_host*host; - u_int8_t rt; - u_int8_t pool_opts; - sa_family_t af; - struct pf_poolhashkey *key; - }route; - struct { u_int32_t limit; u_int32_t seconds; @@ -518,7 +510,6 @@ int parseport(char *, struct range *r, i %type ipspec xhost host dynaddr host_list %type table_host_list tablespec %type redir_host_list redirspec -%type route_host route_host_list routespec %typeos xos os_list %type portspec port_list port_item %type uids uid_list uid_item @@ -975,7 +966,7 @@ anchorrule : ANCHOR anchorname dir quick YYERROR; } - if ($9.route.rt) { + if ($9.rt) { yyerror("cannot specify route handling " "on anchors"); YYERROR; @@ -1843,37 +1834,13 @@ pfrule : action dir logquick interface decide_address_family($7.src.host, &r.af); decide_address_family($7.dst.host, &r.af); - if ($8.route.rt) { - if (!r.direction) { + if ($8.rt) { + if ($8.rt != PF_DUPTO && !r.direction) { yyerror("direction must be explicit " "with rules that specify routing"); YYERROR; } - r.rt = $8.route.rt; - r.route.opts = $8.route.pool_opts; - if ($8.route.key != NULL) - memcpy(&r.route.key, $8.route.key, - sizeof(struct pf_poolhashkey)); - } - if (r.rt) { - decide_address_family($8.route.host, &r.af); - if ((r.route.opts & PF_POOL_TYPEMASK) == - PF_POOL_NONE && ($8.route.host->next != NULL || - $8.route.host->addr.type == PF_ADDR_TABLE || - DYNIF_MULTIADDR($8.route.host->addr))) - r.route.opts |= PF_POOL_ROUNDROBIN; - if ($8.route.host->next != NULL) { - if (!PF_POOL_DYNTYPE(r.route.opts)) { - yyerror("address pool option " - "not supported by type"); - YYERROR; - } - } - /* fake redirspec */ - if (($8.rroute.rdr = calloc(1, - sizeof(*$8.rroute.rdr))) == NULL) - err(1, "$8.rroute.rdr"); - $8.rroute.rdr->host = $8.route.host; + r.rt = $8.rt; } if (expand_divertspec(&r, &$8.divert)) @@ -2137,30 +2104,14 @@ filter_opt : USER uids { sizeof(filter_opts.nat.pool_opts)); filter_opts.nat.pool_opts.staticport = 1; } - | ROUTETO routespec pool_opts { - filter_opts.route.host = $2; -
Re: pf route-to issues
On Mon, Oct 19, 2020 at 12:33:25PM +0100, Stuart Henderson wrote: > On 2020/10/19 19:53, David Gwynne wrote: > > On Mon, Oct 19, 2020 at 09:34:31AM +0100, Stuart Henderson wrote: > > > On 2020/10/19 15:35, David Gwynne wrote: > > > > every few years i try and use route-to in pf, and every time it > > > > goes badly. i tried it again last week in a slightly different > > > > setting, and actually tried to understand the sharp edges i hit > > > > this time instead of giving up. it turns out there are 2 or 3 > > > > different things together that have cause me trouble, which is why > > > > the diff below is so big. > > > > > > I used to route-to/reply-to quite a lot at places with poor internet > > > connections to split traffic between lines (mostly those have better > > > connections now so I don't need it as often). It worked as I expected - > > > but I only ever used it with the interface specified. > > > > cool. did it work beyond the first packet in a connection? > > It must have done. The webcams would have utterly broken the rest of > traffic if it hadn't :) > > > > I mostly used it with pppoe interfaces so the peer address was unknown > > > at ruleset load time. (I was lucky and had static IPs my side, but the > > > ISP side was variable). I relied on the fact that once packets are > > > directed at a point-point interface there's only one place for them to > > > go. I didn't notice that ":peer" might be useful here (and the syntax > > > 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't > > > have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc > > > (though actually I think it works with $any_random_address@pppoeX). > > > > yes. i was trying to use it with peers over ethernet, and always > > struggled with the syntax. > > > > > > the first and i would argue most fundamental problem is a semantic > > > > problem. if you ask a random person who has some clue about networks > > > > and routing what they would expect the "argument" to route-to or > > > > reply-to to be, they would say "a nexthop address" or "a gateway > > > > address". eg, say i want to force packets to a specific backend > > > > server without using NAT, i would write a rule like this: > > > > > > > > n_servers="192.0.2.128/27" > > > > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > > > > > > > pfctl will happily parse this, shove it into the kernel, let you read > > > > the rules back out again with pfctl -sr, and it all looks plausible, but > > > > it turns out that it's using the argument to route-to as an interface > > > > name. because rulesets can refer to interfaces that don't exist yet, pf > > > > just passes the IP address around as a string, hoping i'll plug in an > > > > interface with a driver name that looks like an ip address. i spent > > > > literally a day trying to figure out why a rule like this wasn't > > > > working. > > > > > > I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's > > > supposed to work. So either doc or implementation bug there. > > > > im leaning toward implementation bug. > > > > > route = ( "route-to" | "reply-to" | "dup-to" ) > > > ( routehost | "{" routehost-list "}" ) > > > [ pooltype ] > > > > > > routehost-list = routehost [ [ "," ] routehost-list ] > > > > > > routehost = host | host "@" interface-name | > > > "(" interface-name [ address [ "/" mask-bits ] ] ")" > > > > > > > the second problem is that the pf_route calls from pfsync don't > > > > have all the information it is supposed to have. more specifically, > > > > an ifp pointer isn't set which leads to a segfault. the ifp pointer > > > > isn't set because pfsync doesnt track which interface a packet is > > > > going out, it assumes the ip layer will get it right again later, or a > > > > rule provided something usable. > > > > > > > > the third problem is that pf_route relies on information from rules to > > > > work correctly. this is a problem in a pfsync environment because you > > > > cannot have the same ruleset on both firewalls 100% of the time, which > > > > means you cannot have route-to/reply-to behave consistently on a pair of > > > > firwalls 100% of the time. > > > > > > I didn't run into this because pppoe(4) and pfsync/carp don't really > > > go well together, but ouch! > > > > > > > all of this together makes things work pretty obviously and smoothly. > > > > in my opinion anyway. route-to now works more like rdr-to, it just > > > > feels like it changes the address used for the route lookup rather > > > > than changing the actual IP address in the packet. it also works > > > > predictably in a pfsync pair, which is great from the point of view of > > > > high availability. > > > > > > > > the main caveat is that it's not backward compatible. if you're already > > > > using route-to, you will need to tweak your rules to have them par
Re: pf route-to issues
On 2020/10/19 19:53, David Gwynne wrote: > On Mon, Oct 19, 2020 at 09:34:31AM +0100, Stuart Henderson wrote: > > On 2020/10/19 15:35, David Gwynne wrote: > > > every few years i try and use route-to in pf, and every time it > > > goes badly. i tried it again last week in a slightly different > > > setting, and actually tried to understand the sharp edges i hit > > > this time instead of giving up. it turns out there are 2 or 3 > > > different things together that have cause me trouble, which is why > > > the diff below is so big. > > > > I used to route-to/reply-to quite a lot at places with poor internet > > connections to split traffic between lines (mostly those have better > > connections now so I don't need it as often). It worked as I expected - > > but I only ever used it with the interface specified. > > cool. did it work beyond the first packet in a connection? It must have done. The webcams would have utterly broken the rest of traffic if it hadn't :) > > I mostly used it with pppoe interfaces so the peer address was unknown > > at ruleset load time. (I was lucky and had static IPs my side, but the > > ISP side was variable). I relied on the fact that once packets are > > directed at a point-point interface there's only one place for them to > > go. I didn't notice that ":peer" might be useful here (and the syntax > > 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't > > have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc > > (though actually I think it works with $any_random_address@pppoeX). > > yes. i was trying to use it with peers over ethernet, and always > struggled with the syntax. > > > > the first and i would argue most fundamental problem is a semantic > > > problem. if you ask a random person who has some clue about networks > > > and routing what they would expect the "argument" to route-to or > > > reply-to to be, they would say "a nexthop address" or "a gateway > > > address". eg, say i want to force packets to a specific backend > > > server without using NAT, i would write a rule like this: > > > > > > n_servers="192.0.2.128/27" > > > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > > > > > pfctl will happily parse this, shove it into the kernel, let you read > > > the rules back out again with pfctl -sr, and it all looks plausible, but > > > it turns out that it's using the argument to route-to as an interface > > > name. because rulesets can refer to interfaces that don't exist yet, pf > > > just passes the IP address around as a string, hoping i'll plug in an > > > interface with a driver name that looks like an ip address. i spent > > > literally a day trying to figure out why a rule like this wasn't > > > working. > > > > I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's > > supposed to work. So either doc or implementation bug there. > > im leaning toward implementation bug. > > > route = ( "route-to" | "reply-to" | "dup-to" ) > > ( routehost | "{" routehost-list "}" ) > > [ pooltype ] > > > > routehost-list = routehost [ [ "," ] routehost-list ] > > > > routehost = host | host "@" interface-name | > > "(" interface-name [ address [ "/" mask-bits ] ] ")" > > > > > the second problem is that the pf_route calls from pfsync don't > > > have all the information it is supposed to have. more specifically, > > > an ifp pointer isn't set which leads to a segfault. the ifp pointer > > > isn't set because pfsync doesnt track which interface a packet is > > > going out, it assumes the ip layer will get it right again later, or a > > > rule provided something usable. > > > > > > the third problem is that pf_route relies on information from rules to > > > work correctly. this is a problem in a pfsync environment because you > > > cannot have the same ruleset on both firewalls 100% of the time, which > > > means you cannot have route-to/reply-to behave consistently on a pair of > > > firwalls 100% of the time. > > > > I didn't run into this because pppoe(4) and pfsync/carp don't really > > go well together, but ouch! > > > > > all of this together makes things work pretty obviously and smoothly. > > > in my opinion anyway. route-to now works more like rdr-to, it just > > > feels like it changes the address used for the route lookup rather > > > than changing the actual IP address in the packet. it also works > > > predictably in a pfsync pair, which is great from the point of view of > > > high availability. > > > > > > the main caveat is that it's not backward compatible. if you're already > > > using route-to, you will need to tweak your rules to have them parse. > > > however, i doubt anyone is using this stuff because it feels very broken > > > to me. > > > > Do you expect this to work with a bracketed "address" to defer lookup > > until rule evaluation time? i.e. > > > > pass out proto tcp to any p
Re: pf route-to issues
On Mon, Oct 19, 2020 at 12:28:19PM +0200, Alexandr Nedvedicky wrote: > Hello, > > > > > > > > it seems to me 'route-to vs. pfsync' still needs more thought. the > > > next-hop IP address in route-to may be different for each PF box > > > linked by pfsync(4). To be honest I have no answer to address this > > > issue at the moment. > > > > i have thought about that a little bit. we could play with what the > > argument to route-to means. rather than requiring it to be a directly > > connected host/gateway address, we could interpret it as a destination > > address, and use the gateway for that destination as the nexthop. > > > > eg, if i have the following routing table on frontend a: > > > > Internet: > > DestinationGatewayFlags Refs Use Mtu Prio > > Iface > > default192.168.96.33 UGS6 176171 - 8 > > vmx0 > > 224/4 127.0.0.1 URS00 32768 8 lo0 > > > > 10.0.0.0/30192.168.0.1UGS00 - 8 > > gre0 > > > > > and this routing table on frontend b: > > > > Internet: > > DestinationGatewayFlags Refs Use Mtu Prio > > Iface > > default192.168.96.33 UGS987548 - 8 > > aggr0 > > 224/4 127.0.0.1 URS00 32768 8 lo0 > > > > 10.0.0.0/30192.168.0.3UGS00 - 8 > > gre0 > > > > > if gre0 on both frontends pointed at different legs on the same backend > > server, i could write a pf rule like this: > > > > pass out to port 80 route-to 10.0.0.1 > > > > 10.0.0.1 would then end up as the rt_addr field in pf_state and > > pfsync_state. > > > > both frontend a and b would lookup the route to 10.0.0.1, and then > > use 192.168.0.1 and 192.168.0.3 as the gateway address respectively. > > both would end up pushing the packet over their gre link to the > > same backend. the same semantic would work if the link to the backend > > was over ethernet instead of a tunnel. > > > > > > thoughts? > > > > > > > > > > What you've said makes sense. However I still feel pfsync(4) > > > does not play well with route-to. > > > > maybe your opinion is different if the above makes sense? > > > > Thanks for detailed explanation. This is good enough to make me happy. > The remaining questions on this are sort of 'homework for me' to poke > to PF source code, for example: > are we doing route look up for every packet? or route look up > is performed when state is created/imported? (and we cache > outbound interface + next-hop along the state) the "destination" address is determined when the state is created and stored in rt_addr if pf_state. the route lookup using that address is done per packet in pf_route. > also what happens when route does not exist on pfsync peer, which > receives state? How admin will discover state failed to import? the route lookup or interface lookup fails and the packet is dropped. there are no counters to show this is happening though :( > Anyway, your plan above looks solid to me now. It's certainly more > flexible > (?reliable?) to select route to particular destination, than using pair of > interface,next-hop. cool, i'll keep working on it then. > > thanks and > regards > sashan
Re: pf route-to issues
Hello, > > > > it seems to me 'route-to vs. pfsync' still needs more thought. the > > next-hop IP address in route-to may be different for each PF box > > linked by pfsync(4). To be honest I have no answer to address this > > issue at the moment. > > i have thought about that a little bit. we could play with what the > argument to route-to means. rather than requiring it to be a directly > connected host/gateway address, we could interpret it as a destination > address, and use the gateway for that destination as the nexthop. > > eg, if i have the following routing table on frontend a: > > Internet: > DestinationGatewayFlags Refs Use Mtu Prio Iface > default192.168.96.33 UGS6 176171 - 8 vmx0 > 224/4 127.0.0.1 URS00 32768 8 lo0 > 10.0.0.0/30192.168.0.1UGS00 - 8 gre0 > > and this routing table on frontend b: > > Internet: > DestinationGatewayFlags Refs Use Mtu Prio Iface > default192.168.96.33 UGS987548 - 8 aggr0 > 224/4 127.0.0.1 URS00 32768 8 lo0 > 10.0.0.0/30192.168.0.3UGS00 - 8 gre0 > > if gre0 on both frontends pointed at different legs on the same backend > server, i could write a pf rule like this: > > pass out to port 80 route-to 10.0.0.1 > > 10.0.0.1 would then end up as the rt_addr field in pf_state and > pfsync_state. > > both frontend a and b would lookup the route to 10.0.0.1, and then > use 192.168.0.1 and 192.168.0.3 as the gateway address respectively. > both would end up pushing the packet over their gre link to the > same backend. the same semantic would work if the link to the backend > was over ethernet instead of a tunnel. > > > > thoughts? > > > > > > > What you've said makes sense. However I still feel pfsync(4) > > does not play well with route-to. > > maybe your opinion is different if the above makes sense? > Thanks for detailed explanation. This is good enough to make me happy. The remaining questions on this are sort of 'homework for me' to poke to PF source code, for example: are we doing route look up for every packet? or route look up is performed when state is created/imported? (and we cache outbound interface + next-hop along the state) also what happens when route does not exist on pfsync peer, which receives state? How admin will discover state failed to import? Anyway, your plan above looks solid to me now. It's certainly more flexible (?reliable?) to select route to particular destination, than using pair of interface,next-hop. thanks and regards sashan
Re: pf route-to issues
On Mon, Oct 19, 2020 at 09:34:31AM +0100, Stuart Henderson wrote: > On 2020/10/19 15:35, David Gwynne wrote: > > every few years i try and use route-to in pf, and every time it > > goes badly. i tried it again last week in a slightly different > > setting, and actually tried to understand the sharp edges i hit > > this time instead of giving up. it turns out there are 2 or 3 > > different things together that have cause me trouble, which is why > > the diff below is so big. > > I used to route-to/reply-to quite a lot at places with poor internet > connections to split traffic between lines (mostly those have better > connections now so I don't need it as often). It worked as I expected - > but I only ever used it with the interface specified. cool. did it work beyond the first packet in a connection? > I mostly used it with pppoe interfaces so the peer address was unknown > at ruleset load time. (I was lucky and had static IPs my side, but the > ISP side was variable). I relied on the fact that once packets are > directed at a point-point interface there's only one place for them to > go. I didn't notice that ":peer" might be useful here (and the syntax > 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't > have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc > (though actually I think it works with $any_random_address@pppoeX). yes. i was trying to use it with peers over ethernet, and always struggled with the syntax. > > the first and i would argue most fundamental problem is a semantic > > problem. if you ask a random person who has some clue about networks > > and routing what they would expect the "argument" to route-to or > > reply-to to be, they would say "a nexthop address" or "a gateway > > address". eg, say i want to force packets to a specific backend > > server without using NAT, i would write a rule like this: > > > > n_servers="192.0.2.128/27" > > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > > > pfctl will happily parse this, shove it into the kernel, let you read > > the rules back out again with pfctl -sr, and it all looks plausible, but > > it turns out that it's using the argument to route-to as an interface > > name. because rulesets can refer to interfaces that don't exist yet, pf > > just passes the IP address around as a string, hoping i'll plug in an > > interface with a driver name that looks like an ip address. i spent > > literally a day trying to figure out why a rule like this wasn't > > working. > > I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's > supposed to work. So either doc or implementation bug there. im leaning toward implementation bug. > route = ( "route-to" | "reply-to" | "dup-to" ) > ( routehost | "{" routehost-list "}" ) > [ pooltype ] > > routehost-list = routehost [ [ "," ] routehost-list ] > > routehost = host | host "@" interface-name | > "(" interface-name [ address [ "/" mask-bits ] ] ")" > > > the second problem is that the pf_route calls from pfsync don't > > have all the information it is supposed to have. more specifically, > > an ifp pointer isn't set which leads to a segfault. the ifp pointer > > isn't set because pfsync doesnt track which interface a packet is > > going out, it assumes the ip layer will get it right again later, or a > > rule provided something usable. > > > > the third problem is that pf_route relies on information from rules to > > work correctly. this is a problem in a pfsync environment because you > > cannot have the same ruleset on both firewalls 100% of the time, which > > means you cannot have route-to/reply-to behave consistently on a pair of > > firwalls 100% of the time. > > I didn't run into this because pppoe(4) and pfsync/carp don't really > go well together, but ouch! > > > all of this together makes things work pretty obviously and smoothly. > > in my opinion anyway. route-to now works more like rdr-to, it just > > feels like it changes the address used for the route lookup rather > > than changing the actual IP address in the packet. it also works > > predictably in a pfsync pair, which is great from the point of view of > > high availability. > > > > the main caveat is that it's not backward compatible. if you're already > > using route-to, you will need to tweak your rules to have them parse. > > however, i doubt anyone is using this stuff because it feels very broken > > to me. > > Do you expect this to work with a bracketed "address" to defer lookup > until rule evaluation time? i.e. > > pass out proto tcp to any port 22 route-to (pppoe1:peer) in my opinion route-to should be able to take whatever rdr-to accepts. however, i just tried it and it doesnt currently work, but i'm sure i can figure it out. > I think that will be all that's needed to allow converting the pppoe > use case. I don't have a multiple pppoe setup handy bu
Re: pf route-to issues
On Mon, Oct 19, 2020 at 09:46:19AM +0200, Alexandr Nedvedicky wrote: > Hello, > > disclaimer: I have no chance to run pfsync on production, I'm very > inexperienced with pfsync(4). i wrote the defer code in pfsync, and i think i wrote the code in pfsync that calls pf_route badly, so noones perfect :) > > > > > the third problem is that pf_route relies on information from rules to > > work correctly. this is a problem in a pfsync environment because you > > cannot have the same ruleset on both firewalls 100% of the time, which > > means you cannot have route-to/reply-to behave consistently on a pair of > > firwalls 100% of the time. > > > > my solution to both these problems is reduce the amount of information > > pf_route needs to work with, to make sure that the info it does need > > is in the pf state structure, and that pfsync handles it properly. > > > > if we limit the information needed for pf_route to a nexthop address, > > and which direction the address is used, this is doable. both the > > pf_state and pfsync_state structs already contain an address to store a > > nexthop in, i just had to move the route-to direction from the rule into > > the state. this is easy with pf_state, but i used a spare pad field in > > pfsync_state for this. > > > > it seems to me 'route-to vs. pfsync' still needs more thought. the > next-hop IP address in route-to may be different for each PF box > linked by pfsync(4). To be honest I have no answer to address this > issue at the moment. i have thought about that a little bit. we could play with what the argument to route-to means. rather than requiring it to be a directly connected host/gateway address, we could interpret it as a destination address, and use the gateway for that destination as the nexthop. eg, if i have the following routing table on frontend a: Internet: DestinationGatewayFlags Refs Use Mtu Prio Iface default192.168.96.33 UGS6 176171 - 8 vmx0 224/4 127.0.0.1 URS00 32768 8 lo0 10.0.0.0/30192.168.0.1UGS00 - 8 gre0 127/8 127.0.0.1 UGRS 00 32768 8 lo0 127.0.0.1 127.0.0.1 UHhl 1 142 32768 1 lo0 192.168.0.0192.168.0.0UHl00 - 1 gre0 192.168.0.1192.168.0.0UHh11 - 8 gre0 192.168.96.32/27 192.168.96.34 UCn2 122849 - 4 vmx0 192.168.96.33 00:00:5e:00:01:47 UHLch 114611 - 3 vmx0 192.168.96.34 00:50:56:a1:73:91 UHLl 0 362231 - 1 vmx0 192.168.96.60 fe:e1:ba:d0:74:ef UHLc 059690 - 3 vmx0 192.168.96.63 192.168.96.34 UHb00 - 1 vmx0 and this routing table on frontend b: Internet: DestinationGatewayFlags Refs Use Mtu Prio Iface default192.168.96.33 UGS987548 - 8 aggr0 224/4 127.0.0.1 URS00 32768 8 lo0 10.0.0.0/30192.168.0.3UGS00 - 8 gre0 127/8 127.0.0.1 UGRS 00 32768 8 lo0 127.0.0.1 127.0.0.1 UHhl 1 62 32768 1 lo0 192.168.0.2192.168.0.2UHl00 - 1 gre0 192.168.0.3192.168.0.2UHh11 - 8 gre0 192.168.96.32/27 192.168.96.55 UCn362186 - 4 aggr0 192.168.96.33 00:00:5e:00:01:47 UHLch 1 7442 - 3 aggr0 192.168.96.35 00:23:42:d0:56:8e UHLc 0 1905 - 3 aggr0 192.168.96.55 fe:e1:ba:d0:e0:83 UHLl 0 178119 - 1 aggr0 192.168.96.60 fe:e1:ba:d0:74:ef UHLc 031021 - 3 aggr0 192.168.96.63 192.168.96.55 UHb00 - 1 aggr0 if gre0 on both frontends pointed at different legs on the same backend server, i could write a pf rule like this: pass out to port 80 route-to 10.0.0.1 10.0.0.1 would then end up as the rt_addr field in pf_state and pfsync_state. both frontend a and b would lookup the route to 10.0.0.1, and then use 192.168.0.1 and 192.168.0.3 as the gateway address respectively. both would end up pushing the packet over their gre link to the same backend. the same semantic would work if the link to the backend was over ethernet instead of a tunnel. > > thoughts? > > > > What you've said makes sense. However I still feel pfsync(4) > does not play well with route-to. maybe your opinion is different if the above makes sense? > thanks and > regards > sashan no, thank you for reading my long email. cheers, dlg
Re: pf route-to issues
On 2020/10/19 15:35, David Gwynne wrote: > every few years i try and use route-to in pf, and every time it > goes badly. i tried it again last week in a slightly different > setting, and actually tried to understand the sharp edges i hit > this time instead of giving up. it turns out there are 2 or 3 > different things together that have cause me trouble, which is why > the diff below is so big. I used to route-to/reply-to quite a lot at places with poor internet connections to split traffic between lines (mostly those have better connections now so I don't need it as often). It worked as I expected - but I only ever used it with the interface specified. I mostly used it with pppoe interfaces so the peer address was unknown at ruleset load time. (I was lucky and had static IPs my side, but the ISP side was variable). I relied on the fact that once packets are directed at a point-point interface there's only one place for them to go. I didn't notice that ":peer" might be useful here (and the syntax 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc (though actually I think it works with $any_random_address@pppoeX). > the first and i would argue most fundamental problem is a semantic > problem. if you ask a random person who has some clue about networks > and routing what they would expect the "argument" to route-to or > reply-to to be, they would say "a nexthop address" or "a gateway > address". eg, say i want to force packets to a specific backend > server without using NAT, i would write a rule like this: > > n_servers="192.0.2.128/27" > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > pfctl will happily parse this, shove it into the kernel, let you read > the rules back out again with pfctl -sr, and it all looks plausible, but > it turns out that it's using the argument to route-to as an interface > name. because rulesets can refer to interfaces that don't exist yet, pf > just passes the IP address around as a string, hoping i'll plug in an > interface with a driver name that looks like an ip address. i spent > literally a day trying to figure out why a rule like this wasn't > working. I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's supposed to work. So either doc or implementation bug there. route = ( "route-to" | "reply-to" | "dup-to" ) ( routehost | "{" routehost-list "}" ) [ pooltype ] routehost-list = routehost [ [ "," ] routehost-list ] routehost = host | host "@" interface-name | "(" interface-name [ address [ "/" mask-bits ] ] ")" > the second problem is that the pf_route calls from pfsync don't > have all the information it is supposed to have. more specifically, > an ifp pointer isn't set which leads to a segfault. the ifp pointer > isn't set because pfsync doesnt track which interface a packet is > going out, it assumes the ip layer will get it right again later, or a > rule provided something usable. > > the third problem is that pf_route relies on information from rules to > work correctly. this is a problem in a pfsync environment because you > cannot have the same ruleset on both firewalls 100% of the time, which > means you cannot have route-to/reply-to behave consistently on a pair of > firwalls 100% of the time. I didn't run into this because pppoe(4) and pfsync/carp don't really go well together, but ouch! > all of this together makes things work pretty obviously and smoothly. > in my opinion anyway. route-to now works more like rdr-to, it just > feels like it changes the address used for the route lookup rather > than changing the actual IP address in the packet. it also works > predictably in a pfsync pair, which is great from the point of view of > high availability. > > the main caveat is that it's not backward compatible. if you're already > using route-to, you will need to tweak your rules to have them parse. > however, i doubt anyone is using this stuff because it feels very broken > to me. Do you expect this to work with a bracketed "address" to defer lookup until rule evaluation time? i.e. pass out proto tcp to any port 22 route-to (pppoe1:peer) I think that will be all that's needed to allow converting the pppoe use case. I don't have a multiple pppoe setup handy but I can probably hack together some sort of test. I've also used route-to with squid "transparent" proxying (shown in the pkg-readme), I don't do that any more but I can put a squid test together easily enough. > @@ -1842,37 +1833,18 @@ pfrule: action dir logquick interface > decide_address_family($7.src.host, &r.af); > decide_address_family($7.dst.host, &r.af); > > - if ($8.route.rt) { > + if ($8.rt) { > + if ($8.rt != PF_DUPTO && !r.direction) { >
Re: pf route-to issues
Hello, disclaimer: I have no chance to run pfsync on production, I'm very inexperienced with pfsync(4). > > the third problem is that pf_route relies on information from rules to > work correctly. this is a problem in a pfsync environment because you > cannot have the same ruleset on both firewalls 100% of the time, which > means you cannot have route-to/reply-to behave consistently on a pair of > firwalls 100% of the time. > > my solution to both these problems is reduce the amount of information > pf_route needs to work with, to make sure that the info it does need > is in the pf state structure, and that pfsync handles it properly. > > if we limit the information needed for pf_route to a nexthop address, > and which direction the address is used, this is doable. both the > pf_state and pfsync_state structs already contain an address to store a > nexthop in, i just had to move the route-to direction from the rule into > the state. this is easy with pf_state, but i used a spare pad field in > pfsync_state for this. > it seems to me 'route-to vs. pfsync' still needs more thought. the next-hop IP address in route-to may be different for each PF box linked by pfsync(4). To be honest I have no answer to address this issue at the moment. > > thoughts? > What you've said makes sense. However I still feel pfsync(4) does not play well with route-to. thanks and regards sashan
pf route-to issues
every few years i try and use route-to in pf, and every time it goes badly. i tried it again last week in a slightly different setting, and actually tried to understand the sharp edges i hit this time instead of giving up. it turns out there are 2 or 3 different things together that have cause me trouble, which is why the diff below is so big. the first and i would argue most fundamental problem is a semantic problem. if you ask a random person who has some clue about networks and routing what they would expect the "argument" to route-to or reply-to to be, they would say "a nexthop address" or "a gateway address". eg, say i want to force packets to a specific backend server without using NAT, i would write a rule like this: n_servers="192.0.2.128/27" pass out on $if_internal to $n_servers route-to 192.168.0.1 pfctl will happily parse this, shove it into the kernel, let you read the rules back out again with pfctl -sr, and it all looks plausible, but it turns out that it's using the argument to route-to as an interface name. because rulesets can refer to interfaces that don't exist yet, pf just passes the IP address around as a string, hoping i'll plug in an interface with a driver name that looks like an ip address. i spent literally a day trying to figure out why a rule like this wasn't working. i happened to be talking to pascoe@ at the time, and his vague memory was that the idea was to try and switch the interface a packet was going to travel over, but to try and reuse the arp lookup from the parent one. neither of us could figure out why that would be a good idea though. the best i can say about this is that it only really makes some kind of sense if you're moving a packet into a tunnel. tunnels don't really care about nexthops and will happily route anything you give them. if you were trying to add a route to the routing table to do this, you'd be specifying the peer address on a tunnel interface as the gateway. pf has a if0:peer syntax that makes this convenient to write. so i want to change route-to in pfctl so it takes a nexthop instead of an interface. you could argue that pf already lets you do this, because there's some bs nexthop@interface syntax. my counter argument is that the interface the nexthop is reachable over is redundant, and it makes fixing some of the other problems harder if we keep it. the second and third problems i hit are when route-to is used on a pair of boxes that have pfsync and pfsync defer set up. when defer is enabled, pfsync takes the packet away from the forwarding path, and when it has some confidence that the peer is aware of the state, then it tries to push the packet back out. to understand the following, be aware that route-to, reply-to, and dup-to are implemented in pf in a pair of functions called pf_route and pf_route6. if i say pf_route, just assume i'm talking about both of these functions. the second problem is that the pf_route calls from pfsync don't have all the information it is supposed to have. more specifically, an ifp pointer isn't set which leads to a segfault. the ifp pointer isn't set because pfsync doesnt track which interface a packet is going out, it assumes the ip layer will get it right again later, or a rule provided something usable. the third problem is that pf_route relies on information from rules to work correctly. this is a problem in a pfsync environment because you cannot have the same ruleset on both firewalls 100% of the time, which means you cannot have route-to/reply-to behave consistently on a pair of firwalls 100% of the time. my solution to both these problems is reduce the amount of information pf_route needs to work with, to make sure that the info it does need is in the pf state structure, and that pfsync handles it properly. if we limit the information needed for pf_route to a nexthop address, and which direction the address is used, this is doable. both the pf_state and pfsync_state structs already contain an address to store a nexthop in, i just had to move the route-to direction from the rule into the state. this is easy with pf_state, but i used a spare pad field in pfsync_state for this. the pf_state struct has had which interface the route is using removed. there's no simple way to sync interface information between pfsync peers on the wire, and the need for them is marginal at best. things are much simpler if we can get away with not having this info. a bonus problem i hit is that there's code in pf_match that appears to try and short circuit some processing of states when route-to/reply-to is in effect. this has two consequences. first, if you're using route-to with tcp states, half the tcp state machine is is skipped. when you look at these states with pfctl -vvss, one half of the TCP state never moves forward. secondly, because the processing is short circuited, it never falls through to the end of pf_test where the actual call to pf_route is done. so the first packet is properly handled by pf_