Fernando Fernandez Mancera <[email protected]> wrote:
> Since commit d265929930e2 ("netfilter: nf_conncount: reduce unnecessary
> GC") if nftables/iptables connlimit is used without a check for new
> connections there can be duplicated entries tracked.
>
> Pass the nf_conn struct directly to the nf_conncount API and check
> whether the connection is confirmed or not inside nf_conncount_add(). If
> the connection is confirmed, skip it.
I think there is a bit too much noise here, can this be
split in several chunks?
> -unsigned int nf_conncount_count(struct net *net,
> +unsigned int nf_conncount_count(struct net *net, const struct nf_conn *ct,
> struct nf_conncount_data *data,
> - const u32 *key,
> - const struct nf_conntrack_tuple *tuple,
> - const struct nf_conntrack_zone *zone);
> + const u32 *key);
nf_conn *ct has pitfalls that I did not realize earlier.
Current code is also buggy in that regard.
Maybe we need additional function to ease this for callers.
See below.
> +static int __nf_conncount_add(const struct nf_conn *ct,
> + struct nf_conncount_list *list)
> {
> const struct nf_conntrack_tuple_hash *found;
> struct nf_conncount_tuple *conn, *conn_n;
> + const struct nf_conntrack_tuple *tuple;
> + const struct nf_conntrack_zone *zone;
> struct nf_conn *found_ct;
> unsigned int collect = 0;
>
> + if (!ct || nf_ct_is_confirmed(ct))
> + return -EINVAL;
> +
When is the caller expected to pass a NULL ct?
Maybe this just misses a comment.
> index fc35a11cdca2..e815c0235b62 100644
> --- a/net/netfilter/nft_connlimit.c
> +++ b/net/netfilter/nft_connlimit.c
> @@ -24,28 +24,35 @@ static inline void nft_connlimit_do_eval(struct
> nft_connlimit *priv,
> const struct nft_pktinfo *pkt,
> const struct nft_set_ext *ext)
> {
> - const struct nf_conntrack_zone *zone = &nf_ct_zone_dflt;
> - const struct nf_conntrack_tuple *tuple_ptr;
> + struct nf_conntrack_tuple_hash *h;
> struct nf_conntrack_tuple tuple;
> enum ip_conntrack_info ctinfo;
> const struct nf_conn *ct;
> unsigned int count;
> -
> - tuple_ptr = &tuple;
> + int err;
>
> ct = nf_ct_get(pkt->skb, &ctinfo);
> - if (ct != NULL) {
> - tuple_ptr = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
> - zone = nf_ct_zone(ct);
> - } else if (!nf_ct_get_tuplepr(pkt->skb, skb_network_offset(pkt->skb),
> - nft_pf(pkt), nft_net(pkt), &tuple)) {
> - regs->verdict.code = NF_DROP;
> - return;
> + if (!ct) {
> + if (!nf_ct_get_tuplepr(pkt->skb, skb_network_offset(pkt->skb),
> + nft_pf(pkt), nft_net(pkt), &tuple)) {
> + regs->verdict.code = NF_DROP;
> + return;
> + }
> +
> + h = nf_conntrack_find_get(nft_net(pkt), &nf_ct_zone_dflt,
> &tuple);
> + if (!h) {
> + regs->verdict.code = NF_DROP;
> + return;
> + }
> + ct = nf_ct_tuplehash_to_ctrack(h);
This ct had its refcount incremented.
I also see that this shared copypaste with xtables.
Would it be possible to pass 'const struct sk_buff *'
and let the nf_conncount core handle this internally?
nf_conncount_add(net, pf, skb, priv->list);
which does:
ct = nf_ct_get(skb, &ctinfo);
if (ct && !nf_ct_is_template(ct))
return __nf_conncount_add(ct, list);
if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), pf, net,
&tuple))
return -ERR;
if (ct) /* its a template, so do lookup in right zone */
zone = nf_ct_zone(ct);
else
zone = &nf_ct_zone_dflt;
h = nf_conntrack_find_get(nft_net(pkt), zone, &tuple);
if (!h)
return -ERR;
ct = nf_ct_tuplehash_to_ctrack(h);
err = __nf_conncount_add(ct, list);
nf_ct_put(ct):
return err;
I.e., the existing nf_conncount_add() becomes a helper that takes
a ct, as you have already proposed, but its renamed and turned into
an internal helper so frontends don't need to duplicate tuple lookup.
Alternatively, no need to rename it and instead add a new API call
that takes the ct argument, e.g. 'nf_conncount_add_ct' or whatever,
and then make nf_conncount_add() internal in a followup patch.
> + h = nf_conntrack_find_get(net, &nf_ct_zone_dflt, &tuple);
We should not restrict the lookup to the default zone,
but we should follow what the template (if any) says.
I under-estimated the work to get this right, I think this is too much
code to copypaste between nft+xt frontends. Sorry for making that
suggestion originally.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev