On Thu, May 21, 2015 at 21:28 +0200, Alexandr Nedvedicky wrote:
> Hello,
>
> On Thu, May 21, 2015 at 07:43:51PM +0200, Mike Belopuhov wrote:
> > On Thu, May 21, 2015 at 17:34 +0200, Alexandr Nedvedicky wrote:
> > > Hello,
> > >
> >
> > Hi,
> >
> > > snippet below comes from pf_create_state():
> > >
> > > 3559 if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks,
> > > s)) {
> > > 3560 pf_state_key_detach(s, PF_SK_STACK);
> > > 3561 pf_state_key_detach(s, PF_SK_WIRE);
> > > 3562 *sks = *skw = NULL;
> > > 3563 REASON_SET(&reason, PFRES_STATEINS);
> > > 3564 goto csfailed;
> > > 3565 } else
> > > 3566 *sm = s;
> > > 3567
> > > 3568 /* attach src nodes late, otherwise cleanup on error
> > > nontrivial */
> > > 3569 for (i = 0; i < PF_SN_MAX; i++)
> > > 3570 if (sns[i] != NULL) {
> > > 3571 struct pf_sn_item *sni;
> > > 3572
> > > 3573 sni = pool_get(&pf_sn_item_pl, PR_NOWAIT);
> > > 3574 if (sni == NULL) {
> > > 3575 REASON_SET(&reason, PFRES_MEMORY);
> > > 3576 pf_src_tree_remove_state(s);
> > > 3577 STATE_DEC_COUNTERS(s);
> > > 3578 pool_put(&pf_state_pl, s);
> > > 3579 return (PF_DROP);
> > > 3580 }
> > > 3581 sni->sn = sns[i];
> > > 3582 SLIST_INSERT_HEAD(&s->src_nodes, sni,
> > > next);
> > > 3583 sni->sn->states++;
> > > 3584 }
> > >
> > > at line 3559 PF inserts state to table. If insert operation succeeds, then
> > > state can no longer be killed using simple pool_put() as it currently
> > > happens at line 3578. I think PF should go for pf_unlink_state() instead.
> > >
> > > patch below should kill the bug.
> > >
> >
> > Indeed. But I don't like the comment stating that we're attaching
> > src nodes late because the "cleanup on error nontrivial". Perhaps
> > we should do a pf_state_insert afterwards? This might simplify
> > locking later on.
>
> perhaps swapping the for loop block with pf_state_insert() will work.
> We can then bail out using goto csfailed then (see patch below...)
>
Yup.
> > we should do a pf_state_insert afterwards? This might simplify
> > locking later on.
>
> speaking of locking... everything is more complicated, however
> in this particular case it makes things easier for sure. basically in
> our current SMP model, once we insert state to table, the only way
> to remove it is to leave the job on garbage collector thread...
>
Right.
> >
> > > also one more off-topic question:
> > >
> > > would you be interested in SMP patch for PF?
> > > it basically introduces fine locking and reference counting
> > > on PF data objects, so firewall can handle more packets at
> > > single instance of time.
> > >
> >
> > We would definitely be interested in such a diff, but at the moment
> > there's no use for it as pf_test is called directly from the IP stack
> > and hence IP stack needs to support parallel execution first. This
> > doesn't mean that we can't start integrating bits and pieces, esp.
> > if they're presented as separate patches.
>
> we use a compile time switch to enable SMP code (-D_PF_SMP_). this is
> something I'd like to keep around anyway. I'll start on syncing up our changes
> with CURRENT. It will take couple of weeks (ENOCYCLES as usually). Once
> I'll have something to show I'll share it and you'll see what can be done
> with it.
>
Cool. Looking forward to!
> >
> > Thanks for your quality diffs btw, help is always much appreciated.
> >
> I'm just trying to be useful. It's pleasure to work with PF sources.
>
> thanks and
> regards
> sasha
>
Unfortunately this diff has two nasty bugs (one is not even yours).
> ===================================================================
> RCS file: /cvs/src/sys/net/pf.c,v
> retrieving revision 1.913
> diff -u -r1.913 pf.c
> --- pf.c 11 May 2015 12:22:14 -0000 1.913
> +++ pf.c 21 May 2015 19:18:04 -0000
> @@ -3556,15 +3556,6 @@
> goto csfailed;
> }
>
> - if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
> - pf_state_key_detach(s, PF_SK_STACK);
> - pf_state_key_detach(s, PF_SK_WIRE);
This bug is not yours, but doing two pf_state_key_detach is wrong
and results in all kinds of protection fault fireworks. The right
thing is to do pf_detach_state that would handle the case where
PF_SK_STACK == PF_SK_WIRE (i.e. we need only one invocation).
> - *sks = *skw = NULL;
> - REASON_SET(&reason, PFRES_STATEINS);
> - goto csfailed;
> - } else
> - *sm = s;
> -
> /* attach src nodes late, otherwise cleanup on error nontrivial */
> for (i = 0; i < PF_SN_MAX; i++)
> if (sns[i] != NULL) {
> @@ -3572,16 +3563,21 @@
>
> sni = pool_get(&pf_sn_item_pl, PR_NOWAIT);
> if (sni == NULL) {
> - REASON_SET(&reason, PFRES_MEMORY);
REASON_SET needs to stay.
> - pf_src_tree_remove_state(s);
> - STATE_DEC_COUNTERS(s);
> - pool_put(&pf_state_pl, s);
> - return (PF_DROP);
> + goto csfailed;
> }
> sni->sn = sns[i];
> SLIST_INSERT_HEAD(&s->src_nodes, sni, next);
> sni->sn->states++;
> }
> +
> + if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
> + pf_state_key_detach(s, PF_SK_STACK);
> + pf_state_key_detach(s, PF_SK_WIRE);
> + *sks = *skw = NULL;
> + REASON_SET(&reason, PFRES_STATEINS);
> + goto csfailed;
> + } else
> + *sm = s;
>
> pf_set_rt_ifp(s, pd->src); /* needs s->state_key set */
> if (tag > 0) {
>
In csfailed case we do pf_remove_src_node and then call
pf_src_tree_remove_state. With your diff this means that
pf_src_tree_remove_state will be dereferencing sni->sn that
might be pool_put by the pf_remove_src_node. Therefore their
order needs to be reversed.
I did some debugging with this version and it feels solid.
OKs are welcome.
diff --git sys/net/pf.c sys/net/pf.c
index a5c902c..bc1c9e4 100644
--- sys/net/pf.c
+++ sys/net/pf.c
@@ -3555,37 +3555,32 @@ pf_create_state(struct pf_pdesc *pd, struct pf_rule *r,
struct pf_rule *a,
if (pf_state_key_setup(pd, skw, sks, act->rtableid)) {
REASON_SET(&reason, PFRES_MEMORY);
goto csfailed;
}
- if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
- pf_state_key_detach(s, PF_SK_STACK);
- pf_state_key_detach(s, PF_SK_WIRE);
- *sks = *skw = NULL;
- REASON_SET(&reason, PFRES_STATEINS);
- goto csfailed;
- } else
- *sm = s;
-
- /* attach src nodes late, otherwise cleanup on error nontrivial */
for (i = 0; i < PF_SN_MAX; i++)
if (sns[i] != NULL) {
struct pf_sn_item *sni;
sni = pool_get(&pf_sn_item_pl, PR_NOWAIT);
if (sni == NULL) {
REASON_SET(&reason, PFRES_MEMORY);
- pf_src_tree_remove_state(s);
- STATE_DEC_COUNTERS(s);
- pool_put(&pf_state_pl, s);
- return (PF_DROP);
+ goto csfailed;
}
sni->sn = sns[i];
SLIST_INSERT_HEAD(&s->src_nodes, sni, next);
sni->sn->states++;
}
+ if (pf_state_insert(BOUND_IFACE(r, pd->kif), skw, sks, s)) {
+ pf_detach_state(s);
+ *sks = *skw = NULL;
+ REASON_SET(&reason, PFRES_STATEINS);
+ goto csfailed;
+ } else
+ *sm = s;
+
pf_set_rt_ifp(s, pd->src); /* needs s->state_key set */
if (tag > 0) {
pf_tag_ref(tag);
s->tag = tag;
}
@@ -3610,16 +3605,20 @@ pf_create_state(struct pf_pdesc *pd, struct pf_rule *r,
struct pf_rule *a,
}
return (PF_PASS);
csfailed:
+ if (s) {
+ pf_normalize_tcp_cleanup(s); /* safe even w/o init */
+ pf_src_tree_remove_state(s);
+ }
+
for (i = 0; i < PF_SN_MAX; i++)
if (sns[i] != NULL)
pf_remove_src_node(sns[i]);
+
if (s) {
- pf_normalize_tcp_cleanup(s); /* safe even w/o init */
- pf_src_tree_remove_state(s);
STATE_DEC_COUNTERS(s);
pool_put(&pf_state_pl, s);
}
return (PF_DROP);