On Mon, Oct 19, 2020 at 12:33:25PM +0100, Stuart Henderson wrote:
> On 2020/10/19 19:53, David Gwynne wrote:
> > On Mon, Oct 19, 2020 at 09:34:31AM +0100, Stuart Henderson wrote:
> > > On 2020/10/19 15:35, David Gwynne wrote:
> > > > every few years i try and use route-to in pf, and every time it
> > > > goes badly. i tried it again last week in a slightly different
> > > > setting, and actually tried to understand the sharp edges i hit
> > > > this time instead of giving up. it turns out there are 2 or 3
> > > > different things together that have cause me trouble, which is why
> > > > the diff below is so big.
> > >
> > > I used to route-to/reply-to quite a lot at places with poor internet
> > > connections to split traffic between lines (mostly those have better
> > > connections now so I don't need it as often). It worked as I expected -
> > > but I only ever used it with the interface specified.
> >
> > cool. did it work beyond the first packet in a connection?
>
> It must have done. The webcams would have utterly broken the rest of
> traffic if it hadn't :)
>
> > > I mostly used it with pppoe interfaces so the peer address was unknown
> > > at ruleset load time. (I was lucky and had static IPs my side, but the
> > > ISP side was variable). I relied on the fact that once packets are
> > > directed at a point-point interface there's only one place for them to
> > > go. I didn't notice that ":peer" might be useful here (and the syntax
> > > 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't
> > > have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc
> > > (though actually I think it works with $any_random_address@pppoeX).
> >
> > yes. i was trying to use it with peers over ethernet, and always
> > struggled with the syntax.
> >
> > > > the first and i would argue most fundamental problem is a semantic
> > > > problem. if you ask a random person who has some clue about networks
> > > > and routing what they would expect the "argument" to route-to or
> > > > reply-to to be, they would say "a nexthop address" or "a gateway
> > > > address". eg, say i want to force packets to a specific backend
> > > > server without using NAT, i would write a rule like this:
> > > >
> > > > n_servers="192.0.2.128/27"
> > > > pass out on $if_internal to $n_servers route-to 192.168.0.1
> > > >
> > > > pfctl will happily parse this, shove it into the kernel, let you read
> > > > the rules back out again with pfctl -sr, and it all looks plausible, but
> > > > it turns out that it's using the argument to route-to as an interface
> > > > name. because rulesets can refer to interfaces that don't exist yet, pf
> > > > just passes the IP address around as a string, hoping i'll plug in an
> > > > interface with a driver name that looks like an ip address. i spent
> > > > literally a day trying to figure out why a rule like this wasn't
> > > > working.
> > >
> > > I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's
> > > supposed to work. So either doc or implementation bug there.
> >
> > im leaning toward implementation bug.
> >
> > > route = ( "route-to" | "reply-to" | "dup-to" )
> > > ( routehost | "{" routehost-list "}" )
> > > [ pooltype ]
> > >
> > > routehost-list = routehost [ [ "," ] routehost-list ]
> > >
> > > routehost = host | host "@" interface-name |
> > > "(" interface-name [ address [ "/" mask-bits ] ] ")"
> > >
> > > > the second problem is that the pf_route calls from pfsync don't
> > > > have all the information it is supposed to have. more specifically,
> > > > an ifp pointer isn't set which leads to a segfault. the ifp pointer
> > > > isn't set because pfsync doesnt track which interface a packet is
> > > > going out, it assumes the ip layer will get it right again later, or a
> > > > rule provided something usable.
> > > >
> > > > the third problem is that pf_route relies on information from rules to
> > > > work correctly. this is a problem in a pfsync environment because you
> > > > cannot have the same ruleset on both firewalls 100% of the time, which
> > > > means you cannot have route-to/reply-to behave consistently on a pair of
> > > > firwalls 100% of the time.
> > >
> > > I didn't run into this because pppoe(4) and pfsync/carp don't really
> > > go well together, but ouch!
> > >
> > > > all of this together makes things work pretty obviously and smoothly.
> > > > in my opinion anyway. route-to now works more like rdr-to, it just
> > > > feels like it changes the address used for the route lookup rather
> > > > than changing the actual IP address in the packet. it also works
> > > > predictably in a pfsync pair, which is great from the point of view of
> > > > high availability.
> > > >
> > > > the main caveat is that it's not backward compatible. if you're already
> > > > using route-to, you will need to tweak your rules to have them parse.
> > > > however, i doubt anyone is using this stuff because it feels very broken
> > > > to me.
> > >
> > > Do you expect this to work with a bracketed "address" to defer lookup
> > > until rule evaluation time? i.e.
> > >
> > > pass out proto tcp to any port 22 route-to (pppoe1:peer)
> >
> > in my opinion route-to should be able to take whatever rdr-to accepts.
> > however, i just tried it and it doesnt currently work, but i'm sure i
> > can figure it out.
i think i tried it with the wrong pfctl. it does appear to work as
we want.
> >
> > > I think that will be all that's needed to allow converting the pppoe
> > > use case. I don't have a multiple pppoe setup handy but I can probably
> > > hack together some sort of test.
> > >
> > > I've also used route-to with squid "transparent" proxying (shown in
> > > the pkg-readme), I don't do that any more but I can put a squid test
> > > together easily enough.
> >
> > it looks doable.
> >
> > my changes will break route-to with "no state" rules though. should
> > i try and keep those working too?
>
> I don't think I ever used that myself. aja@ added the "no state" in
> squid's pkg-readme (to avoid state problems when hairpinning the
> intercepted packets back to a machine on a subnet alongside the client,
> squid replying to the client directly), no idea if he's still using that,
> I would guess probably not.
>
> If I understand the problem correctly I think this case could use "keep
> state (sloppy)" instead anyway.
cool.
here's a tweaked diff. i also checked that it did the thing sashan@ and
i were talking about where the route-to address is used as a
destination, and the gateway for that destination is used for the
routing.
i am feeling very warm and fuzzy about this diff at the moment.
Index: sbin/pfctl/parse.y
===================================================================
RCS file: /cvs/src/sbin/pfctl/parse.y,v
retrieving revision 1.704
diff -u -p -r1.704 parse.y
--- sbin/pfctl/parse.y 1 Oct 2020 14:02:08 -0000 1.704
+++ sbin/pfctl/parse.y 19 Oct 2020 23:19:22 -0000
@@ -276,6 +276,7 @@ struct filter_opts {
struct redirspec nat;
struct redirspec rdr;
struct redirspec rroute;
+ u_int8_t rt;
/* scrub opts */
int nodf;
@@ -284,15 +285,6 @@ struct filter_opts {
int randomid;
int max_mss;
- /* route opts */
- struct {
- struct node_host *host;
- u_int8_t rt;
- u_int8_t pool_opts;
- sa_family_t af;
- struct pf_poolhashkey *key;
- } route;
-
struct {
u_int32_t limit;
u_int32_t seconds;
@@ -517,7 +509,6 @@ int parseport(char *, struct range *r, i
%type <v.host> ipspec xhost host dynaddr host_list
%type <v.host> table_host_list tablespec
%type <v.host> redir_host_list redirspec
-%type <v.host> route_host route_host_list routespec
%type <v.os> os xos os_list
%type <v.port> portspec port_list port_item
%type <v.uid> uids uid_list uid_item
@@ -974,7 +965,7 @@ anchorrule : ANCHOR anchorname dir quick
YYERROR;
}
- if ($9.route.rt) {
+ if ($9.rt) {
yyerror("cannot specify route handling "
"on anchors");
YYERROR;
@@ -1842,37 +1833,13 @@ pfrule : action dir logquick interface
decide_address_family($7.src.host, &r.af);
decide_address_family($7.dst.host, &r.af);
- if ($8.route.rt) {
- if (!r.direction) {
+ if ($8.rt) {
+ if ($8.rt != PF_DUPTO && !r.direction) {
yyerror("direction must be explicit "
"with rules that specify routing");
YYERROR;
}
- r.rt = $8.route.rt;
- r.route.opts = $8.route.pool_opts;
- if ($8.route.key != NULL)
- memcpy(&r.route.key, $8.route.key,
- sizeof(struct pf_poolhashkey));
- }
- if (r.rt) {
- decide_address_family($8.route.host, &r.af);
- if ((r.route.opts & PF_POOL_TYPEMASK) ==
- PF_POOL_NONE && ($8.route.host->next !=
NULL ||
- $8.route.host->addr.type == PF_ADDR_TABLE ||
- DYNIF_MULTIADDR($8.route.host->addr)))
- r.route.opts |= PF_POOL_ROUNDROBIN;
- if ($8.route.host->next != NULL) {
- if (!PF_POOL_DYNTYPE(r.route.opts)) {
- yyerror("address pool option "
- "not supported by type");
- YYERROR;
- }
- }
- /* fake redirspec */
- if (($8.rroute.rdr = calloc(1,
- sizeof(*$8.rroute.rdr))) == NULL)
- err(1, "$8.rroute.rdr");
- $8.rroute.rdr->host = $8.route.host;
+ r.rt = $8.rt;
}
if (expand_divertspec(&r, &$8.divert))
@@ -2136,30 +2103,14 @@ filter_opt : USER uids {
sizeof(filter_opts.nat.pool_opts));
filter_opts.nat.pool_opts.staticport = 1;
}
- | ROUTETO routespec pool_opts {
- filter_opts.route.host = $2;
- filter_opts.route.rt = PF_ROUTETO;
- filter_opts.route.pool_opts = $3.type | $3.opts;
- memcpy(&filter_opts.rroute.pool_opts, &$3,
- sizeof(filter_opts.rroute.pool_opts));
- if ($3.key != NULL)
- filter_opts.route.key = $3.key;
+ | ROUTETO routespec {
+ filter_opts.rt = PF_ROUTETO;
}
- | REPLYTO routespec pool_opts {
- filter_opts.route.host = $2;
- filter_opts.route.rt = PF_REPLYTO;
- filter_opts.route.pool_opts = $3.type | $3.opts;
- if ($3.key != NULL)
- filter_opts.route.key = $3.key;
- }
- | DUPTO routespec pool_opts {
- filter_opts.route.host = $2;
- filter_opts.route.rt = PF_DUPTO;
- filter_opts.route.pool_opts = $3.type | $3.opts;
- memcpy(&filter_opts.rroute.pool_opts, &$3,
- sizeof(filter_opts.rroute.pool_opts));
- if ($3.key != NULL)
- filter_opts.route.key = $3.key;
+ | REPLYTO routespec {
+ filter_opts.rt = PF_REPLYTO;
+ }
+ | DUPTO routespec {
+ filter_opts.rt = PF_DUPTO;
}
| not RECEIVEDON if_item {
if (filter_opts.rcv) {
@@ -3733,122 +3684,21 @@ pool_opt : BITMASK {
}
;
-route_host : STRING {
- /* try to find @if0 address specs */
- if (strrchr($1, '@') != NULL) {
- if (($$ = host($1, pf->opts)) == NULL) {
- yyerror("invalid host for route spec");
- YYERROR;
- }
- free($1);
- } else {
- $$ = calloc(1, sizeof(struct node_host));
- if ($$ == NULL)
- err(1, "route_host: calloc");
- $$->ifname = $1;
- $$->addr.type = PF_ADDR_NONE;
- set_ipmask($$, 128);
- $$->next = NULL;
- $$->tail = $$;
- }
- }
- | STRING '/' STRING {
- char *buf;
-
- if (asprintf(&buf, "%s/%s", $1, $3) == -1)
- err(1, "host: asprintf");
- free($1);
- if (($$ = host(buf, pf->opts)) == NULL) {
- /* error. "any" is handled elsewhere */
- free(buf);
- yyerror("could not parse host specification");
- YYERROR;
- }
- free(buf);
- }
- | '<' STRING '>' {
- if (strlen($2) >= PF_TABLE_NAME_SIZE) {
- yyerror("table name '%s' too long", $2);
- free($2);
- YYERROR;
- }
- $$ = calloc(1, sizeof(struct node_host));
- if ($$ == NULL)
- err(1, "host: calloc");
- $$->addr.type = PF_ADDR_TABLE;
- if (strlcpy($$->addr.v.tblname, $2,
- sizeof($$->addr.v.tblname)) >=
- sizeof($$->addr.v.tblname))
- errx(1, "host: strlcpy");
- free($2);
- $$->next = NULL;
- $$->tail = $$;
- }
- | dynaddr '/' NUMBER {
- struct node_host *n;
-
- if ($3 < 0 || $3 > 128) {
- yyerror("bit number too big");
- YYERROR;
- }
- $$ = $1;
- for (n = $1; n != NULL; n = n->next)
- set_ipmask(n, $3);
- }
- | '(' STRING host ')' {
- struct node_host *n;
-
- $$ = $3;
- /* XXX check masks, only full mask should be allowed */
- for (n = $3; n != NULL; n = n->next) {
- if ($$->ifname) {
- yyerror("cannot specify interface twice
"
- "in route spec");
- YYERROR;
- }
- if (($$->ifname = strdup($2)) == NULL)
- errx(1, "host: strdup");
- }
- free($2);
- }
- ;
-
-route_host_list : route_host optweight optnl {
- if ($2 > 0) {
- struct node_host *n;
- for (n = $1; n != NULL; n = n->next)
- n->weight = $2;
- }
- $$ = $1;
- }
- | route_host_list comma route_host optweight optnl {
- if ($1->af == 0)
- $1->af = $3->af;
- if ($1->af != $3->af) {
- yyerror("all pool addresses must be in the "
- "same address family");
+routespec : redirspec pool_opts {
+ struct redirection *redir;
+ if (filter_opts.rt != PF_NOPFROUTE) {
+ yyerror("cannot respecify "
+ "route-to/reply-to/dup-to");
YYERROR;
}
- $1->tail->next = $3;
- $1->tail = $3->tail;
- if ($4 > 0) {
- struct node_host *n;
- for (n = $3; n != NULL; n = n->next)
- n->weight = $4;
- }
- $$ = $1;
- }
- ;
-
-routespec : route_host optweight {
- if ($2 > 0) {
- struct node_host *n;
- for (n = $1; n != NULL; n = n->next)
- n->weight = $2;
- }
- $$ = $1;
+ redir = calloc(1, sizeof(*redir));
+ if (redir == NULL)
+ err(1, "routespec calloc");
+ redir->host = $1;
+ filter_opts.rroute.rdr = redir;
+ memcpy(&filter_opts.rroute.pool_opts, &$2,
+ sizeof(filter_opts.rroute.pool_opts));
}
- | '{' optnl route_host_list '}' { $$ = $3; }
;
timeout_spec : STRING NUMBER
@@ -4709,7 +4559,7 @@ expand_rule(struct pf_rule *r, int keepr
error += collapse_redirspec(&r->rdr, r, rdr, 0);
error += collapse_redirspec(&r->nat, r, nat, 0);
- error += collapse_redirspec(&r->route, r, rroute, 1);
+ error += collapse_redirspec(&r->route, r, rroute, 0);
/* disallow @if in from or to for the time being */
if ((src_host->addr.type == PF_ADDR_ADDRMASK &&
@@ -5955,7 +5805,7 @@ filteropts_to_rule(struct pf_rule *r, st
yyerror("af-to can only be used with direction in");
return (1);
}
- if ((opts->marker & FOM_AFTO) && opts->route.rt) {
+ if ((opts->marker & FOM_AFTO) && opts->rt) {
yyerror("af-to cannot be used together with "
"route-to, reply-to, dup-to");
return (1);
Index: sys/net/if_pfsync.c
===================================================================
RCS file: /cvs/src/sys/net/if_pfsync.c,v
retrieving revision 1.278
diff -u -p -r1.278 if_pfsync.c
--- sys/net/if_pfsync.c 24 Aug 2020 15:30:58 -0000 1.278
+++ sys/net/if_pfsync.c 19 Oct 2020 23:19:22 -0000
@@ -612,7 +612,8 @@ pfsync_state_import(struct pfsync_state
st->rtableid[PF_SK_STACK] = ntohl(sp->rtableid[PF_SK_STACK]);
/* copy to state */
- bcopy(&sp->rt_addr, &st->rt_addr, sizeof(st->rt_addr));
+ st->rt_addr = sp->rt_addr;
+ st->rt = sp->rt;
st->creation = getuptime() - ntohl(sp->creation);
st->expire = getuptime();
if (ntohl(sp->expire)) {
@@ -643,7 +644,6 @@ pfsync_state_import(struct pfsync_state
st->rule.ptr = r;
st->anchor.ptr = NULL;
- st->rt_kif = NULL;
st->pfsync_time = getuptime();
st->sync_state = PFSYNC_S_NONE;
@@ -1843,6 +1843,7 @@ pfsync_undefer(struct pfsync_deferral *p
{
struct pfsync_softc *sc = pfsyncif;
struct pf_pdesc pdesc;
+ struct pf_state *s = pd->pd_st;
NET_ASSERT_LOCKED();
@@ -1852,35 +1853,33 @@ pfsync_undefer(struct pfsync_deferral *p
TAILQ_REMOVE(&sc->sc_deferrals, pd, pd_entry);
sc->sc_deferred--;
- CLR(pd->pd_st->state_flags, PFSTATE_ACK);
+ CLR(s->state_flags, PFSTATE_ACK);
if (drop)
m_freem(pd->pd_m);
else {
- if (pd->pd_st->rule.ptr->rt == PF_ROUTETO) {
+ if (s->rt == PF_ROUTETO) {
if (pf_setup_pdesc(&pdesc,
- pd->pd_st->key[PF_SK_WIRE]->af,
- pd->pd_st->direction, pd->pd_st->rt_kif,
+ s->key[PF_SK_WIRE]->af,
+ s->direction, s->kif,
pd->pd_m, NULL) != PF_PASS) {
m_freem(pd->pd_m);
goto out;
}
- switch (pd->pd_st->key[PF_SK_WIRE]->af) {
+ switch (s->key[PF_SK_WIRE]->af) {
case AF_INET:
- pf_route(&pdesc,
- pd->pd_st->rule.ptr, pd->pd_st);
+ pf_route(&pdesc, s);
break;
#ifdef INET6
case AF_INET6:
- pf_route6(&pdesc,
- pd->pd_st->rule.ptr, pd->pd_st);
+ pf_route6(&pdesc, s);
break;
#endif /* INET6 */
default:
- unhandled_af(pd->pd_st->key[PF_SK_WIRE]->af);
+ unhandled_af(s->key[PF_SK_WIRE]->af);
}
pd->pd_m = pdesc.m;
} else {
- switch (pd->pd_st->key[PF_SK_WIRE]->af) {
+ switch (s->key[PF_SK_WIRE]->af) {
case AF_INET:
ip_output(pd->pd_m, NULL, NULL, 0, NULL, NULL,
0);
@@ -1892,12 +1891,12 @@ pfsync_undefer(struct pfsync_deferral *p
break;
#endif /* INET6 */
default:
- unhandled_af(pd->pd_st->key[PF_SK_WIRE]->af);
+ unhandled_af(s->key[PF_SK_WIRE]->af);
}
}
}
out:
- pf_state_unref(pd->pd_st);
+ pf_state_unref(s);
pool_put(&sc->sc_pool, pd);
}
Index: sys/net/pf.c
===================================================================
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.1094
diff -u -p -r1.1094 pf.c
--- sys/net/pf.c 24 Jul 2020 18:17:15 -0000 1.1094
+++ sys/net/pf.c 19 Oct 2020 23:19:22 -0000
@@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc
}
*state = s;
- if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif &&
- ((s->rule.ptr->rt == PF_ROUTETO &&
- s->rule.ptr->direction == PF_OUT) ||
- (s->rule.ptr->rt == PF_REPLYTO &&
- s->rule.ptr->direction == PF_IN)))
- return (PF_PASS);
return (PF_MATCH);
}
@@ -1186,7 +1180,8 @@ pf_state_export(struct pfsync_state *sp,
/* copy from state */
strlcpy(sp->ifname, st->kif->pfik_name, sizeof(sp->ifname));
- memcpy(&sp->rt_addr, &st->rt_addr, sizeof(sp->rt_addr));
+ sp->rt = st->rt;
+ sp->rt_addr = st->rt_addr;
sp->creation = htonl(getuptime() - st->creation);
expire = pf_state_expires(st);
if (expire <= getuptime())
@@ -3433,29 +3428,13 @@ pf_set_rt_ifp(struct pf_state *s, struct
struct pf_rule *r = s->rule.ptr;
int rv;
- s->rt_kif = NULL;
- if (!r->rt)
+ if (r->rt == PF_NOPFROUTE)
return (0);
- switch (af) {
- case AF_INET:
- rv = pf_map_addr(AF_INET, r, saddr, &s->rt_addr, NULL, sns,
- &r->route, PF_SN_ROUTE);
- break;
-#ifdef INET6
- case AF_INET6:
- rv = pf_map_addr(AF_INET6, r, saddr, &s->rt_addr, NULL, sns,
- &r->route, PF_SN_ROUTE);
- break;
-#endif /* INET6 */
- default:
- rv = 1;
- }
-
- if (rv == 0) {
- s->rt_kif = r->route.kif;
- s->natrule.ptr = r;
- }
+ rv = pf_map_addr(af, r, saddr, &s->rt_addr, NULL, sns,
+ &r->route, PF_SN_ROUTE);
+ if (rv == 0)
+ s->rt = r->rt;
return (rv);
}
@@ -5986,15 +5965,13 @@ pf_rtlabel_match(struct pf_addr *addr, s
/* pf_route() may change pd->m, adjust local copies after calling */
void
-pf_route(struct pf_pdesc *pd, struct pf_rule *r, struct pf_state *s)
+pf_route(struct pf_pdesc *pd, struct pf_state *s)
{
struct mbuf *m0, *m1;
struct sockaddr_in *dst, sin;
struct rtentry *rt = NULL;
struct ip *ip;
struct ifnet *ifp = NULL;
- struct pf_addr naddr;
- struct pf_src_node *sns[PF_SN_MAX];
int error = 0;
unsigned int rtableid;
@@ -6004,11 +5981,11 @@ pf_route(struct pf_pdesc *pd, struct pf_
return;
}
- if (r->rt == PF_DUPTO) {
+ if (s->rt == PF_DUPTO) {
if ((m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT)) == NULL)
return;
} else {
- if ((r->rt == PF_REPLYTO) == (r->direction == pd->dir))
+ if ((s->rt == PF_REPLYTO) == (s->direction == pd->dir))
return;
m0 = pd->m;
}
@@ -6021,44 +5998,31 @@ pf_route(struct pf_pdesc *pd, struct pf_
ip = mtod(m0, struct ip *);
- memset(&sin, 0, sizeof(sin));
- dst = &sin;
- dst->sin_family = AF_INET;
- dst->sin_len = sizeof(*dst);
- dst->sin_addr = ip->ip_dst;
- rtableid = m0->m_pkthdr.ph_rtableid;
-
if (pd->dir == PF_IN) {
if (ip->ip_ttl <= IPTTLDEC) {
- if (r->rt != PF_DUPTO)
+ if (s->rt != PF_DUPTO)
pf_send_icmp(m0, ICMP_TIMXCEED,
ICMP_TIMXCEED_INTRANS, 0,
- pd->af, r, pd->rdomain);
+ pd->af, s->rule.ptr, pd->rdomain);
goto bad;
}
ip->ip_ttl -= IPTTLDEC;
}
- if (s == NULL) {
- memset(sns, 0, sizeof(sns));
- if (pf_map_addr(AF_INET, r,
- (struct pf_addr *)&ip->ip_src,
- &naddr, NULL, sns, &r->route, PF_SN_ROUTE)) {
- DPFPRINTF(LOG_ERR,
- "%s: pf_map_addr() failed", __func__);
- goto bad;
- }
+ memset(&sin, 0, sizeof(sin));
+ dst = &sin;
+ dst->sin_family = AF_INET;
+ dst->sin_len = sizeof(*dst);
+ dst->sin_addr.s_addr = s->rt_addr.v4.s_addr;
+ rtableid = m0->m_pkthdr.ph_rtableid;
- if (!PF_AZERO(&naddr, AF_INET))
- dst->sin_addr.s_addr = naddr.v4.s_addr;
- ifp = r->route.kif ?
- r->route.kif->pfik_ifp : NULL;
- } else {
- if (!PF_AZERO(&s->rt_addr, AF_INET))
- dst->sin_addr.s_addr =
- s->rt_addr.v4.s_addr;
- ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL;
+ rt = rtalloc(sintosa(dst), RT_RESOLVE, rtableid);
+ if (!rtisvalid(rt)) {
+ ipstat_inc(ips_noroute);
+ goto bad;
}
+
+ ifp = if_get(rt->rt_ifidx);
if (ifp == NULL)
goto bad;
@@ -6074,12 +6038,6 @@ pf_route(struct pf_pdesc *pd, struct pf_
}
ip = mtod(m0, struct ip *);
}
-
- rt = rtalloc(sintosa(dst), RT_RESOLVE, rtableid);
- if (!rtisvalid(rt)) {
- ipstat_inc(ips_noroute);
- goto bad;
- }
/* A locally generated packet may have invalid source address. */
if ((ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET &&
(ifp->if_flags & IFF_LOOPBACK) == 0)
@@ -6105,9 +6063,9 @@ pf_route(struct pf_pdesc *pd, struct pf_
*/
if (ip->ip_off & htons(IP_DF)) {
ipstat_inc(ips_cantfrag);
- if (r->rt != PF_DUPTO)
+ if (s->rt != PF_DUPTO)
pf_send_icmp(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG,
- ifp->if_mtu, pd->af, r, pd->rdomain);
+ ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain);
goto bad;
}
@@ -6131,8 +6089,9 @@ pf_route(struct pf_pdesc *pd, struct pf_
ipstat_inc(ips_fragmented);
done:
- if (r->rt != PF_DUPTO)
+ if (s->rt != PF_DUPTO)
pd->m = NULL;
+ if_put(ifp);
rtfree(rt);
return;
@@ -6144,15 +6103,13 @@ bad:
#ifdef INET6
/* pf_route6() may change pd->m, adjust local copies after calling */
void
-pf_route6(struct pf_pdesc *pd, struct pf_rule *r, struct pf_state *s)
+pf_route6(struct pf_pdesc *pd, struct pf_state *s)
{
struct mbuf *m0;
struct sockaddr_in6 *dst, sin6;
struct rtentry *rt = NULL;
struct ip6_hdr *ip6;
struct ifnet *ifp = NULL;
- struct pf_addr naddr;
- struct pf_src_node *sns[PF_SN_MAX];
struct m_tag *mtag;
unsigned int rtableid;
@@ -6162,11 +6119,11 @@ pf_route6(struct pf_pdesc *pd, struct pf
return;
}
- if (r->rt == PF_DUPTO) {
+ if (s->rt == PF_DUPTO) {
if ((m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT)) == NULL)
return;
} else {
- if ((r->rt == PF_REPLYTO) == (r->direction == pd->dir))
+ if ((s->rt == PF_REPLYTO) == (s->direction == pd->dir))
return;
m0 = pd->m;
}
@@ -6178,42 +6135,31 @@ pf_route6(struct pf_pdesc *pd, struct pf
}
ip6 = mtod(m0, struct ip6_hdr *);
- memset(&sin6, 0, sizeof(sin6));
- dst = &sin6;
- dst->sin6_family = AF_INET6;
- dst->sin6_len = sizeof(*dst);
- dst->sin6_addr = ip6->ip6_dst;
- rtableid = m0->m_pkthdr.ph_rtableid;
-
if (pd->dir == PF_IN) {
if (ip6->ip6_hlim <= IPV6_HLIMDEC) {
- if (r->rt != PF_DUPTO)
+ if (s->rt != PF_DUPTO)
pf_send_icmp(m0, ICMP6_TIME_EXCEEDED,
ICMP6_TIME_EXCEED_TRANSIT, 0,
- pd->af, r, pd->rdomain);
+ pd->af, s->rule.ptr, pd->rdomain);
goto bad;
}
ip6->ip6_hlim -= IPV6_HLIMDEC;
}
- if (s == NULL) {
- memset(sns, 0, sizeof(sns));
- if (pf_map_addr(AF_INET6, r, (struct pf_addr *)&ip6->ip6_src,
- &naddr, NULL, sns, &r->route, PF_SN_ROUTE)) {
- DPFPRINTF(LOG_ERR,
- "%s: pf_map_addr() failed", __func__);
- goto bad;
- }
- if (!PF_AZERO(&naddr, AF_INET6))
- pf_addrcpy((struct pf_addr *)&dst->sin6_addr,
- &naddr, AF_INET6);
- ifp = r->route.kif ? r->route.kif->pfik_ifp : NULL;
- } else {
- if (!PF_AZERO(&s->rt_addr, AF_INET6))
- pf_addrcpy((struct pf_addr *)&dst->sin6_addr,
- &s->rt_addr, AF_INET6);
- ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL;
+ memset(&sin6, 0, sizeof(sin6));
+ dst = &sin6;
+ dst->sin6_family = AF_INET6;
+ dst->sin6_len = sizeof(*dst);
+ pf_addrcpy((struct pf_addr *)&dst->sin6_addr, &s->rt_addr, AF_INET6);
+ rtableid = m0->m_pkthdr.ph_rtableid;
+
+ rt = rtalloc(sin6tosa(dst), RT_RESOLVE, rtableid);
+ if (!rtisvalid(rt)) {
+ ip6stat_inc(ip6s_noroute);
+ goto bad;
}
+
+ ifp = if_get(rt->rt_ifidx);
if (ifp == NULL)
goto bad;
@@ -6231,11 +6177,7 @@ pf_route6(struct pf_pdesc *pd, struct pf
if (IN6_IS_SCOPE_EMBED(&dst->sin6_addr))
dst->sin6_addr.s6_addr16[1] = htons(ifp->if_index);
- rt = rtalloc(sin6tosa(dst), RT_RESOLVE, rtableid);
- if (!rtisvalid(rt)) {
- ip6stat_inc(ip6s_noroute);
- goto bad;
- }
+
/* A locally generated packet may have invalid source address. */
if (IN6_IS_ADDR_LOOPBACK(&ip6->ip6_src) &&
(ifp->if_flags & IFF_LOOPBACK) == 0)
@@ -6253,15 +6195,16 @@ pf_route6(struct pf_pdesc *pd, struct pf
ifp->if_output(ifp, m0, sin6tosa(dst), rt);
} else {
ip6stat_inc(ip6s_cantfrag);
- if (r->rt != PF_DUPTO)
+ if (s->rt != PF_DUPTO)
pf_send_icmp(m0, ICMP6_PACKET_TOO_BIG, 0,
- ifp->if_mtu, pd->af, r, pd->rdomain);
+ ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain);
goto bad;
}
done:
- if (r->rt != PF_DUPTO)
+ if (s->rt != PF_DUPTO)
pd->m = NULL;
+ if_put(ifp);
rtfree(rt);
return;
@@ -6271,7 +6214,6 @@ bad:
}
#endif /* INET6 */
-
/*
* check TCP checksum and set mbuf flag
* off is the offset where the protocol header starts
@@ -7287,14 +7229,14 @@ done:
pd.m = NULL;
break;
default:
- if (r->rt) {
+ if (s && s->rt) {
switch (pd.af) {
case AF_INET:
- pf_route(&pd, r, s);
+ pf_route(&pd, s);
break;
#ifdef INET6
case AF_INET6:
- pf_route6(&pd, r, s);
+ pf_route6(&pd, s);
break;
#endif /* INET6 */
}
Index: sys/net/pfvar.h
===================================================================
RCS file: /cvs/src/sys/net/pfvar.h,v
retrieving revision 1.496
diff -u -p -r1.496 pfvar.h
--- sys/net/pfvar.h 24 Aug 2020 15:30:58 -0000 1.496
+++ sys/net/pfvar.h 19 Oct 2020 23:19:22 -0000
@@ -133,7 +133,7 @@ enum { PFTM_TCP_FIRST_PACKET, PFTM_TCP_O
*/
#define PF_FRAG_ENTRY_LIMIT 64
-enum { PF_NOPFROUTE, PF_ROUTETO, PF_DUPTO, PF_REPLYTO };
+enum { PF_NOPFROUTE = 0 , PF_ROUTETO, PF_DUPTO, PF_REPLYTO };
enum { PF_LIMIT_STATES, PF_LIMIT_SRC_NODES, PF_LIMIT_FRAGS,
PF_LIMIT_TABLES, PF_LIMIT_TABLE_ENTRIES, PF_LIMIT_PKTDELAY_PKTS,
PF_LIMIT_MAX };
@@ -762,7 +762,6 @@ struct pf_state {
struct pf_sn_head src_nodes;
struct pf_state_key *key[2]; /* addresses stack and wire */
struct pfi_kif *kif;
- struct pfi_kif *rt_kif;
u_int64_t packets[2];
u_int64_t bytes[2];
int32_t creation;
@@ -797,6 +796,7 @@ struct pf_state {
u_int16_t if_index_out;
pf_refcnt_t refcnt;
u_int16_t delay;
+ u_int8_t rt;
};
/*
@@ -852,7 +852,7 @@ struct pfsync_state {
u_int8_t proto;
u_int8_t direction;
u_int8_t log;
- u_int8_t pad0;
+ u_int8_t rt;
u_int8_t timeout;
u_int8_t sync_flags;
u_int8_t updates;
@@ -1798,8 +1798,8 @@ int pf_state_key_attach(struct pf_state_
int pf_translate(struct pf_pdesc *, struct pf_addr *, u_int16_t,
struct pf_addr *, u_int16_t, u_int16_t, int);
int pf_translate_af(struct pf_pdesc *);
-void pf_route(struct pf_pdesc *, struct pf_rule *, struct pf_state *);
-void pf_route6(struct pf_pdesc *, struct pf_rule *, struct pf_state *);
+void pf_route(struct pf_pdesc *, struct pf_state *);
+void pf_route6(struct pf_pdesc *, struct pf_state *);
void pf_init_threshold(struct pf_threshold *, u_int32_t, u_int32_t);
int pf_delay_pkt(struct mbuf *, u_int);